Gene expression cassette and its use

The present invention relates to a gene expression cassette and in particular to the use of the
cassette in methods for presenting polypeptides on the surface of bacterial cells and/or
secreting them into the surroundings of the latter. The invention further relates to gene
expression constructs that are used to transform bacterial host cells. Uses of the invention
include immunisation, in particular mucosal immunisation, induction of immunological
tolerance and anti-tumour therapy in humans and animals. The intended vaccines and
anti-cancer agents will also make use of bacterial spores produced by Clostridia, e.g. Clostridium
difficile, for both industrial production of the vaccine and for local production of the desired
polypeptides at the body sites desired.

Vaccines against infection represent the greatest advance in medicine with
unparalleled impact on morbidity and mortality at relatively low cost. Despite their
cost-effectiveness, the cost associated with modern vaccines is still of concern and limits their use,
particularly in developing countries. A number of factors contribute to the cost of injectable
vaccines including the requirements for vaccines with defined sub-cellular components, for
purity and sterility of the vaccine preparations, for testing of administration routes and
combinations with different adjuvants, for maintaining the cold chain in distribution of the
vaccines, and for using sterile syringes and needles. The need for repeated vaccinations also
contributes to increased costs. Furthermore, for many infectious diseases a vaccine has not yet
been made available.

Also, in recent years great interest has been shown in mucosal immunisation, i.e. the
exposure of mucosal surfaces to an antigen to elicit a general humoral and mucosal immune
response, i.e. also at distant sites (Mucosal Immunology, Ed. P.L. Ogra et al., Acad. Press,
1999). Although this field is in its infancy, two such products are so far on the market, namely
the oral polio vaccine and an oral (drinkable) vaccine against cholera and diarrhea due to
Escherichia coli. The latter is an inactivated vaccine containing killed Vibrio cholerae
organisms plus the cholera toxin subunit B, which is non-toxic, immunogenic and shared
between the cholera toxin and the toxin of "enterotoxigenic E. coli" (ETEC), the main cause
of "travellers' diarrhea". It is hoped that mucosal immunisation via, for example, intranasal or
peroral administration of the antigen will provide a real alternative to injectable vaccines. As
most microbes invade humans and animals via mucous membranes, we anticipate that the
mucosal route will turn out to be the superior alternative for vaccination in many instances.

The scientific consensus at this point appears to be that live vaccines are potentially
superior candidates for single-dose long lasting vaccination, because the carrier organism will
continue to produce the antigen and boost immunity in vivo. However, experience with the
bacterial carriers currently studied, for example the intestinal bacteria Salmonella and E. coli,
is that they are not generally classified as safe and are further difficult to distribute. Therefore,
probiotic organisms such as Lactobacilli have been proposed as suitable carriers of foreign
antigens. Problems have been identified; these include limited shelf-life unless freeze-dried,
which reduces viability, and difficulties in colonising the gut of the human recipient and
evoking an immune response.

A classical way of enhancing the immunogenicity of vaccines is co-administration of
so-called adjuvants. These may be molecules such as aluminium hydroxide or lipid vesicles
that increase the exposure time for the vaccine by slowing its removal from the site of
injection, or "danger molecules" of microbial origin that increase the immune response in a
non-specific way. Thus, in recent years it has been found that adjuvants also act by evoking
production of immunomodulatory peptides called cytokines and chemokines (Brewer JM,
Alexander J, Cytokines Cell Mol Ther 4:233-246, 1997. Ulanova M, Classical and
non-classical antigen-presenting molecules in immune responsiveness, Thesis, Goteborg
University, Sweden, 2000, ISBN 91-628-4228-5). Due to possible adverse effects the use of
cytokines themselves as adjuvants must await a better understanding of optimal selection and
dosing of these molecules together with vaccines. It is likely that applying cytokine adjuvants
by the mucosal route is less critical than parenteral administration from a side effect point of
view.

Interest has also been shown in the extracellular presentation of foreign protein
antigens by bacteria, included as part of bacterial surface layer proteins. A surface layer
(S-layer) protein is herein defined as any molecule of proteinaceous nature, including e.g.
protein, glyco- or lipoprotein occurring in the outer layer of a bacterium and capable of being
exposed on the surface of the bacterium. S-layer proteins are a main constituent of the cell
wall of some gram-positive bacterial genera. They may be continuously and spontaneously
produced in larger amounts than any other class of protein in the cell. WO-95/19371 describes
a fusion protein of at least a part of an S-layer protein and a heterologous peptide, the intention
being that the polypeptide is expressed and presented on the surface of the cell. A range of
bacterial hosts is mentioned including Staphylococcus, Streptococcus, Bacillus, Clostridium
and Listeria. A preference for Bacillus is stated and the examples use B. sphaericus.
Elsewhere, WO-97/28263 describes processes for the recombinant preparation of S-layer
proteins in gram-negative host cells. It is suggested that these proteins could include antigenic
species. FR-A-2778922 describes the use of genes which regulate the synthesis of toxin
products in Clostridium bacteria, to produce polypeptides.

We first set out to investigate toxin release and to identify other extracellular proteins
produced by Clostridium difficile (C. difficile). C. difficile is an anaerobic spore forming
pathogen causing C. difficile associated diarrhea (CDAD) and pseudomembranous colitis
(PMC) by producing two toxins, A and B.

In contrast to studies by Kamiya et al (J. Med. Microbiol., 1992, 37, 206-210) and
Ketley et al (J. Med. Microbiol., 1984, 18, 385-391) we found that the accumulation of
extracellular toxins is not accompanied by cell lysis, suggesting a toxin export mechanism.
Our first detailed analysis (described in Example 9 below) of proteins occurring
extracellularly as well as in the cell wall and membrane fraction of strains VPI 10643 and 630
and analysis of the DNA/genes encoding these proteins revealed a genomic segment
containing eighteen genes (Figure 1A). Seven proteins (ORF1, 3, 5-7, 9 and 11), when
compared with publicly available sequences, showed some homology to N-acetyl muramoyl
L-alanine amidase (CwlB/LytC) and modifier protein of major autolysin (LytB) from B. subtilis,
as well as S-layer proteins from Lactobacillus spp (Tables 1 and 4). The amidase motif was
located either at the C-terminal or the N-terminal end of the S-layer protein ORFs (see
examples in Figure 1B). Other ORFs showed similarity to genes involved in polypeptide
secretion (ORF2/secA), polysaccharide and capsule synthesis, and possibly glucosylation of
the S-layer proteins. Further database searches indicated that the amidase motif confers
anchorage of the S-layer proteins to the clostridial cell wall peptidoglycan-teichoic acid.

A search in the revised C. difficile database revealed five additional genes upstream of
ORF1 which had similarities to the previously found ones, i.e. they had a two-domain
architecture, one domain showing homology to the CwlB/LytC and LytB proteins. These ORFs thus
had the putative cell wall binding amidase motif typical of the other S-layer ORFs and were
designated D, E, G, H and I by us (Fig. 2 and Table 1).

Significantly, we found that the N-termini of all S-layer ORFs contained a typical
signal peptide for export via sec-dependent secretion and that for e.g. ORF1 the predicted
signal peptide cleavage site (Figure 3) was identical to that found in the protein sequence.
Furthermore, the secreted ORF1 product was further cleaved into two peptides in strain 630,
the C-terminal one containing the N-acetyl muramoyl L-alanine amidase like sequence
(Figure 1B and Example 9). Although we identified the cleavage point, the kinetics and
precise mechanism of this proteolytic event remain unknown. It is likely that the S-layer
protein cleavage product containing the amidase motif provides cell-wall binding, whereas the
other peptide, showing more sequence variability, is more surface exposed and provides
antigen variation between strains (serotypes). We found no significant match between
sequences within ORF1 and the S-layer homology motif (SLH domain) found in most
presently known S-layer proteins, although a weak similarity to an S-layer protein from
Lactobacillus helveticus was found for the N-terminal part of ORF1 (Figure 1B). Also, apart
from the shared amidase motif, we found no significant homology between the different
S-layer ORFs that could suggest how the variable cleavage product(s) get anchored to the
cell-wall binding peptide (inner S-layer) or to each other to form the outermost layer recently
distinguished by electron microscopy (Cerquetti M et al. Characterization of surface layer
proteins from different C. difficile clinical isolates. Microb Pathogenesis 28: 363-372, 2000).
Our subsequent analysis of 21 C. difficile serogroup type strains, probably
representing all major genetic lineages of the species, indicated that a gene segment
corresponding to ORF1 and its upstream region plus ORF2 is generally present in C. difficile
(Example 1C).

The N-terminal part of ORF6 showed homology to eukaryotic cysteine proteases (Fig.
1B). ORF5 has been suggested to be involved in adhesion to epithelial cells (Abstract; The
Third International Meeting on the Molecular Genetics and Pathogenesis of the Clostridia,
June 8-11, 2000, Chiba, Japan).

The present invention is based, at least in part, on the above discoveries. We have
identified and developed a polypeptide expression and secretion system that may be used to
produce a desired polypeptide on the surface of and/or into the surroundings of bacteria, for
introduction into an appropriate mammal. The system may be used for example to initiate
mucosal vaccination. A particular advantage of the system is that it may be used with any
convenient Clostridium species, independently of any normal S-layer protein production.
Furthermore, in case of C. difficile it is possible to use strains lacking the 5-gene toxicity
cassette encoding the two major virulence factors toxins A and B and thus avoid the risk of
CDAD when administering the engineered peptide producing strains to humans aged 2-4
years or more (neonates and small children are insensitive to the toxins) or animals.
Therefore, in a first aspect of the invention we provide a gene expression cassette
comprising a secretory leader sequence selected from any one of ORF1, ORF3, ORF5-7,
ORF9 or ORF11 (SEQ ID NO: 1-7) (cf. Figure 1 and Table 1) of C. difficile strain 630
linked to a DNA sequence encoding a heterologous polypeptide. Alternatively, the secretory
leader sequence is from any one of ORF D, E, G, H and I (SEQ ID NO: 8-12) (cf. Figure 2
and Table 1) or from any analogous S-layer ORF taken from any C. difficile strain.

By "heterologous" we mean a nucleic acid sequence or protein not native to the 
clostridial strain being used. 

Use of each of the secretory leader sequences mentioned above represents a separate 
and independent aspect of the invention. The secretory leader sequence is preferably from 
ORF1. 

A recent publication by Karjalainen et al, Infection and Immunity, May 2001, p. 3442-3446
provides confirmation and analysis of most of the ORFD - ORFI - ORF1 - ORF11
S-layer gene cluster in C. difficile strain 630. The nucleotide and polypeptide sequences
disclosed by Karjalainen et al are incorporated herein by reference.

In a further aspect of the invention the gene expression cassette further includes a
promoter of prokaryotic origin. The promoter is preferably a strong promoter and in general
is placed 5' of the secretory leader sequence in the gene cassette.

In a further aspect of the invention the gene expression cassette further includes a
DNA sequence encoding at least a functional portion of an S-layer protein of C. difficile fused
to a nucleic acid coding sequence coding for a heterologous polypeptide such that the
resulting fusion polypeptide will be expressed and presented on the outer surface of the host
cell harbouring the cassette. If desired the polypeptide can also be released from the bacteria,
e.g. by excluding the S-layer amidase motif from the construct (cf. Figure 3).

In a further aspect of the invention the engineered gene expression cassette optionally 
further comprises at least a functional part of the secretory (secA) gene represented by ORF2. 
This may be used to complement or replace the function of the normal C. difficile sec gene in 
order to ensure efficient translocation of the peptide(s) produced by the cassette across the 
cytoplasmic membrane. 

An example of a preferred gene expression cassette is conveniently illustrated in
Figure 3.

The promoter in the gene expression cassette is conveniently a strong promoter; this
may be the native promoter for ORFs 1 - 12 of C. difficile strain 630 (Table 1).
Alternatively, the promoter sequence is from any one of ORF D - I (cf. Table 1), alternatively
from any other analogous S-layer ORF from a C. difficile strain or from another gene,
preferably from this species (see Specific description B1). The promoter may thus be another
prokaryotic promoter that is strong, inducible or constitutive, and functional in the
polypeptide producing bacterium. In all potential applications a distinct advantage of this
cassette is the very large amounts of protein produced and exported.

The gene expression cassette is conveniently placed in a vector or specifically a
plasmid carrying a transposon belonging to, for example, the Tn916, Tn5397 or Tn5398
families. After transfection of a C. difficile host organism these transposons are able to insert
themselves into its chromosome, thereby making the engineered cassette a stable trait of the
bacterium (cf. Figure 4). For other Clostridia other vectors may be preferable, e.g. the
engineered shuttle plasmid pJIR750. Unlike the C. difficile plasmids currently available, this
vector can replicate within both an E. coli and a C. perfringens host and is not dependent on
integration of the plasmid into the host chromosome. Any convenient Clostridium species
may be used; to date over 70 species have been defined by rRNA sequence analysis. These
include C. difficile and classical pathogens such as C. perfringens, C. tetani and C. botulinum, also
C. acetobutylicum that is being genetically manipulated and used for industrial production of
acetic acid and C. beijerinckii that has been transformed with E. coli genes.

C. perfringens is currently the species most amenable to genetic engineering. It is a
normal, moderate level, fecal coloniser of most, if not all, humans. C. difficile is found in the
fecal flora of most newborns, less often in adults but commonly in hospitalized individuals.
As C. difficile is an early, normally colonising intestinal organism and even toxigenic strains
are unable to cause CDAD in newborns and infants up to 2-4 years of age, we believe that
recombinant C. difficile producing desired antigens and adjuvants is suitable for oral
vaccination at any convenient time after birth.

Whereas C. perfringens normally produces many toxins, about half of wild C. difficile
strains are genetically non-toxigenic, which may be an advantage from a safety point of view.
C. difficile toxin negative strains are preferred as host cells for the gene expression cassettes of
this invention, at least for individuals aged 2-4 years or more (see above).

The nucleic acid sequence coding for a heterologous polypeptide is placed in the gene
expression cassette before or after insertion into a convenient vector or plasmid. The insertion
points for the nucleic acid sequence are at the discretion of the skilled scientist; these may be
in the variable or in the constant region of the relevant ORF nucleotide sequence. Routine
experimentation may be used to determine convenient and particular insertion points. In
Figure 3 we disclose polypeptide cleavage sites that need to be taken into consideration (see
Specific description B3).

Examples of convenient plasmids include those mentioned in Figure 4, for example
pCI195 and pSMB47. Convenient transposons include those belonging to the Tn916,
Tn5397 and Tn5398 families for transfection into C. difficile; for C. perfringens or other
Clostridia a plasmid such as pJIR750 may be used. Any convenient heterologous nucleic acid
sequence may be placed into the gene expression cassette. In a further aspect of the invention
we provide a vector or plasmid comprising a gene cassette of the invention.

The vector or plasmid may then be transfected into a convenient host using techniques
known in the art (see for example: Gene 82: 327-333, 1989). For C. difficile it is at present
preferred to introduce the plasmid into a Bacillus species such as B. subtilis and then transfer
the target DNA by filter mating (conjugation) into a convenient C. difficile strain (outlined in
Fig. 4). This will generally require the use of a conjugative transposon-bearing plasmid such
as pCI195 or pSMB47 (J. Antimicrob. Chemother. 35: 305-315, 1995; FEMS Microbiol.
Lett., 168: 259-268, 1998; D. Lyras, J. I. Rood, Clostridial genetics, in Gram-positive
pathogens, ed. V. A. Fischetti, Am. Soc. Microbiol, 2000). However, we anticipate that
further materials and procedures will become available for the direct introduction of plasmids
or other foreign DNAs into Clostridia and particularly C. difficile. For e.g. C. perfringens
such vectors and techniques are to some extent already available.

In a further aspect of the invention we provide a Clostridial bacterium transformed 
with a gene expression cassette of the invention encoding the desired fusion peptide or 
entirely heterologous polypeptide(s). 

The transformed Clostridial bacterium, when administered orally to any convenient
mammal such as a human or animal, will lead to intestinal colonization, production and
presentation of the desired polypeptide, particularly in the large bowel that is the natural site of
colonization of C. difficile. The bowel wall is surrounded by an immense immune apparatus,
the so-called Peyer's patches, and is thus specialized in mounting immune responses of various
types. Large bowel colonization by a clostridial vaccine or peptide producer strain thus
enables a much longer immune stimulus than a traditional injection. In contrast to Clostridia,
the alternative and much studied S-layer producers for vaccine purposes, Bacillus spp, are
free-living, obligate aerobic bacteria and unable to replicate in the anaerobic bowel lumen and
thus unable to colonize a recipient mammal. For clostridial colonization and peptide delivery
in hypoxic tissues iv administration is used.

It will be appreciated that Clostridia carrying the gene expression cassette of this
invention including DNA encoding different heterologous peptides allow the highly efficient
production and export of these polypeptides in hypoxic tissues after iv administration, or into
the gut, particularly the colon, of the orally colonized individual for a variety of prophylactic
or therapeutic uses. Another advantage of this gene cassette for expression of heterologous
peptides is its versatility, i.e. that it is normally used to produce and export peptides of varying
size and having completely different amino acid sequences, in their N-terminal or C-terminal
end.
In further independent aspects of the invention the recombinant gene expression
cassette is used to produce in the gut, for example

(i) peptides and enzymes for therapy and prophylaxis of various diseases, e.g. peptides
having specific antimicrobial activity, cytokines against inflammatory bowel disease,
and beta-lactamases to prevent diarrhea due to antibiotic therapy

(ii) single, fusion or multiple polypeptide antigens of microbial, animal or mammalian
origin for neonatal immune balancing, vaccination against infections, allergy,
metabolic or auto-immune disease, cancer, infertility, and drug addiction.

(iii) carrier molecules (so-called adjuvants) separate or fused to the antigen in order to
amplify or modulate the immune response to the antigen in a desired way according to
(ii), e.g. a strong IgA response against a mucosal invader.

In a still further aspect of the invention the gene expression cassette of the invention 
may be used to provide recombinant Clostridia for local production of peptides in tissues after 
iv administration of their spores (see below), for example for the prophylaxis and/or treatment 
of fibrinolysis in arterial or venous occlusion and/or for revitalising gangrenous and/or 
necrotic tissue in various diseases. Furthermore, for anti-tumour therapy by local production 
of 

(i) immune stimulating human peptides for improving tumour host defence, 

(ii) enzymes that convert a pro-drug to a cytostatic agent inside a tumour (thus avoiding 
systemic side effects) 

(iii) cytotoxins of e.g. bacterial origin to destroy tumour cells 
(iv) angiogenesis inhibitors at local concentrations enough to prevent local blood vessel
formation and thus, tumour growth

(v) signal transduction inhibitors. 

In a further aspect of the invention we provide a pharmaceutical or veterinary
composition which comprises a transformed viable Clostridial cell with the ability to present
and/or to secrete the desired polypeptide together with a pharmaceutically or veterinarily
acceptable carrier or diluent.

The composition may be formulated as a vaccine. The composition may be
administered orally or intranasally, or alternatively the polypeptide can be isolated, purified
and administered parenterally, e.g. subcutaneously or intramuscularly.

The amount of the desired peptide(s) presented and/or secreted by the transformed
strain may be modulated in the body by using

(i) promoters with different strength (power),

(ii) a promoter or regulator responding to external stimuli (inducible, e.g. by a specific
carbohydrate) normally present in the gut or administered together with the engineered
bacterium,

(iii) different dosage regimens (number of bacteria per dose and doses per time period)

or

(iv) methods that influence the ability of the strain to colonise and propagate in the gut
for convenient periods of time. Relevant factors include ability to compete with other
bacteria, adhere to mucosal cells, and to avoid expulsion by local immune response
mechanisms. The latter can be achieved by exploiting induction of tolerance to the natural
S-layer antigen of the strain in the neonate (see above and below) or by using its putative normal
antigenic variation, suggested to us by the presence of the large number of S-layer ORFs
present in C. difficile, either by allowing the vaccine strain to change serotype during natural
long-term colonization or by repeated applications (colonizations) over time of different
strains producing the same antigen but having a different serotype antigen.

The transformed Clostridia, as anaerobic organisms, are conveniently produced by
fermentation under for example low oxygen tension and purified and recovered as known in
the art for native Clostridia, for example by washing and freeze-drying. They may be
formulated together with excipients as needed, for example magnesium stearate, lactose, or
carboxymethyl cellulose, into solid dosage forms, e.g. in capsules, predominantly for oral
administration. The dosage forms may be protected against the acidity of the stomach by a
suitable enteric coating, comprising for example Eudragit "S", Eudragit "L", cellulose
acetate, cellulose phthalate or hydroxypropyl cellulose. A preferred dosage form comprises
freeze-dried transformed Clostridia contained in vials or ampoules, optionally under inert gas.
Preferably, the transformed Clostridia cells are administered orally or intranasally, as an
aqueous, reconstituted suspension of the lyophilized cells e.g. in water or physiological saline,
optionally with addition of pharmaceutically acceptable buffers, e.g. sodium bicarbonate,
phosphate or citrate to keep the pH of the suspension between 6 and 8, preferably between 6.5
and 7.5.

The dosage forms produced as described above may comprise a mixture of viable and
non-viable bacteria depending on the process and/or the storage conditions. The viable,
transformed Clostridia will, after oral administration, become attached to those parts of the
gut, for example the lower intestinal tract, which provide appropriate growing conditions and
proliferate, producing the desired polypeptide in increasing amounts. This will provide for an
enhanced and sustained physiological effect, for example immunisation, of the polypeptide.

If exposure to defined amounts of the polypeptide is desired, non-viable transformed
Clostridia presenting the polypeptide can be administered. The non-viable cells can be
obtained as known in the art, e.g. by exposing the live cells to agents, e.g. heat, formaldehyde,
antibiotics or solvents, which kill them. It is also possible to use cell walls (sacculi) or to use
S-layer fragments obtained by mechanical or other disruption of the bacterial cells. These
agents can be formulated into pharmaceutical and veterinary compositions as described above
for live transformed Clostridia.

By "secretion" or "release" we mean that the heterologous polypeptide is exported out
from the host cell into the surrounding environment as a soluble antigen. This is conveniently
achieved by fusing the DNA coding for the polypeptide to a DNA sequence coding for a
signal peptide sequence of any one of ORF1, ORF3, ORF5-7, ORF9 or ORF11 (SEQ ID
NO: 1-7, cf. Figure 1 and Table 1), preferably to that of ORF1, and expressing it as described
above under control of a strong promoter and exporting it with the aid of the sec gene (ORF2)
product. Alternatively, the DNA codes for a signal peptide sequence of any one of ORF D, E,
G, H or I (SEQ ID NO: 8-12, cf. Figure 2 and Table 1), or that of any other suitable secreted
bacterial protein.

By "presentation" we mean that the polypeptide is translocated across the cell
membrane and presented on the surface of the bacterium in a sufficient manner for it to act as,
for example, a particulate antigen. The DNA coding for the heterologous polypeptide may
then for example be fused to an S-layer coding sequence, which codes at least for a functional
cell wall binding portion of an S-layer protein of C. difficile (Figure 3), and expressed as
described above to get exposure of the heterologous polypeptide on the outside of the host cell,
hooked to the amidase motif of the S-layer protein. Alternatively, this motif may be omitted
from the construct in order to get increased release of the heterologous peptide (see Specific
description B3).
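The distinction between "secretion" and "presentation" drawn above can be summarised in a small illustrative sketch. It is not part of the disclosed method: the class name `Cassette`, its field names and the `fate` method are hypothetical labels chosen here, and the sketch assumes only what the text states, namely that the leader sequence comes from one of the named S-layer ORFs and that including or omitting the cell wall binding (amidase) motif decides whether the polypeptide is presented on the cell surface or released.

```python
from dataclasses import dataclass

# ORFs named in the text as possible sources of the secretory leader sequence.
LEADER_ORFS = {
    "ORF1", "ORF3", "ORF5", "ORF6", "ORF7", "ORF9", "ORF11",  # SEQ ID NO: 1-7
    "ORFD", "ORFE", "ORFG", "ORFH", "ORFI",                   # SEQ ID NO: 8-12
}


@dataclass(frozen=True)
class Cassette:
    """Hypothetical description of one gene expression cassette design."""
    promoter: str                # placed 5' of the leader sequence
    leader_orf: str              # ORF supplying the signal peptide
    insert: str                  # heterologous polypeptide coding sequence
    amidase_motif: bool = True   # S-layer cell wall binding portion included?
    sec_a: bool = False          # optional ORF2 (secA) included?

    def __post_init__(self):
        # Reject leader sources that the text does not name.
        if self.leader_orf not in LEADER_ORFS:
            raise ValueError(f"{self.leader_orf} is not a listed leader ORF")

    def fate(self) -> str:
        # With the cell wall binding motif the product stays on the surface;
        # without it the product is released into the surroundings.
        return "presented" if self.amidase_motif else "secreted"


vaccine = Cassette("native ORF1 promoter", "ORF1", "antigen")
enzyme = Cassette("native ORF1 promoter", "ORF1", "beta-lactamase",
                  amidase_motif=False)
print(vaccine.fate(), enzyme.fate())
```

The two example designs mirror the uses suggested in the text: a cell-bound antigen for vaccination and a freely released enzyme. The model is deliberately binary; it does not capture a polypeptide that is both presented and released.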

The heterologous polypeptide may be a foreign epitope or immunogen giving rise to
antibodies that protect against disease; we note that many antibodies elicited are not
protective. It typically comprises an antigenic determinant of a pathogen. The pathogen may
be a virus, bacterium, fungus, yeast or parasite. The antigen may also be a "self" molecule for
prevention or cure of disease (see below). The heterologous polypeptide may further be an
antimicrobial peptide, e.g. for elimination of undesired microorganisms, an anti-tumour
peptide (see below) or a molecule that changes the immune response of the gut from a
negative one, such as allergy or auto-immune tissue destruction, to a positive one, such as
infection protection (see above). For example, Lactobacillus components are believed to
prevent allergy development and live lactobacilli are currently given to infants in successful
trials for prevention of allergy (Bjorksten B, pers comm, and Kalliomäki et al. Lancet 357:
1076-1097, 2001). Cysteine proteases such as cathepsin are thought to change the intestinal,
mucosal response to infection from a Th2 type (disease promoting) to a Th1 response
(infection protection). Alternatively, the polypeptide(s) may be enzyme(s) that improve
digestion of food, or that together synthesize a polysaccharide antigen of a microorganism, an
antibiotic, or a specific vitamin or other nutrient or hormone useful to the host mammal. An
enzyme produced by the engineered Clostridial bacterium may also be an antibiotic
inactivating enzyme, e.g. a beta-lactamase, to be given together with or after the antibiotic for
prevention of CDAD or non-specific antibiotic induced diarrhoea, common problems in
hospitals today.

The heterologous polypeptide may also be a part of an antibody molecule. This may
comprise the constant part in order for example to obtain an enhanced non-specific immune
response or the response to a co-administered antigen (adjuvant effect). Alternatively, it may
be the variable part directed against any surface or secreted component of a microorganism
(toxin, antigen, adhesin) in order to prevent its ability to colonize and cause intestinal disease.
The expression product of the cassette of the invention may also represent the immune
stimulating part of allergy causing antigens lacking their IgE interacting part, thus evoking an
antibody response but avoiding an allergic reaction (anti-allergy vaccination).

Whether the heterologous polypeptide is to be provided alone or fused with a carrier
peptide, or presented cell-bound, released or both depends on its desired function. For
example, for a polypeptide acting as an enzyme, free "secreted" molecules may be most
effective, whereas in case of vaccination an antigen fused to a carrier peptide or being a
cell-bound ("presented") polypeptide on a bacterium, strongly adhering to or being phagocytosed
by the gut mucosa, may give the best mucosal immune response.

The immune response to a heterologous peptide may be increased by fusion to the
repeating C-terminal sequences encoding the non-toxic motifs of the C. difficile toxins A and
B that enable these to enter the colonic mucosal cells by receptor-mediated endocytosis,
and/or to a portion of toxin B responsible for intracellular and intercellular spread of the
antigen (see Barth et al below). Thus, by using adhering clostridial bacteria producing a
desired heterologous peptide antigen fused to non-toxic parts of the C. difficile toxins, the
mucosal immune response may be boosted (adjuvant effect).

A further improved immune response may be obtained by exploiting the natural S-layer
proteins of C. difficile, which seem to anchor the organism to the mucosa. It is likely that
the amidase-like fragments are directed inwards to provide cell wall anchorage, whereas the
sequence-unique fragments represent the outermost portion of the S-layer protein, serving
as surface antigen (see above) and probably also as adhesin by which C. difficile attaches to
the mucosal cell surface, as recently suggested by Waligora et al. (Infect Immun 69:2144-2153,
2001). Thus, by switching between expression of its different S-layer ORFs over time, each C.
difficile strain may achieve surface antigen variation and thus immune evasion and prolonged
colonization in the gut. However, after deliberate colonization of a newborn with a certain
C. difficile strain, particularly if expressing or being administered together with a "danger
molecule", tolerance (see above and below) to its S-layer serotype protein may be obtained,
enabling the use of the same C. difficile serotype for efficient deliberate long-term
colonization of this individual, e.g. for vaccination purposes, also during later periods of life.

In a further aspect a carrier peptide or adjuvant, e.g. a "danger molecule", is used in
addition to the desired heterologous polypeptide, administered or produced in vivo either
as a separate molecule or fused to the principal (antigenic) polypeptide. This is in order to
amplify desired specific immune responses for prevention or therapy of infection, or in the
neonate also for shifting its general response towards anti-infection and tolerance of "self" and
therefore away from allergy and autoimmunity (see above and below). The "danger
molecule" or adjuvant is a species that may stay in a human or animal body for a long time,
such as up to one, three or six months or up to one year. Alternatively, or in addition, this
species is capable of eliciting a stronger immune response than the desired heterologous
polypeptide acting alone. "Danger molecules" are often of microbial origin, rapidly
recognized and strongly reacted upon by both the innate/primitive and the trained/specialized
immune system (see above). A convenient reference for this aspect of the invention is the
thesis by Carola Rask of the Department of Medical Microbiology and Immunology at
Göteborg University, Sweden, ISBN 91-628-4497-0, "Cholera toxin B subunit as a carrier for
inducing mucosal immunity and/or peripheral tolerance."

By using a mixture of different genetically engineered Clostridia strains, presenting
and/or secreting different heterologous polypeptides, which act additively or synergistically, it
is possible to achieve an enhanced physiological response. Thus oral immunization with a
mixture of genetically engineered Clostridia strains, each presenting and/or secreting a
different polypeptide derived from the same or another microbial pathogen, will provide both
a broader immune response and a so-called adjuvant effect, i.e. a more complete immune
response and a better vaccination against the pathogen than obtained by using just one strain
expressing a single immunogenic epitope of the pathogen.

In a further aspect of the invention we provide a medicament or therapeutic agent
which comprises a Clostridial bacterium transformed with a gene cassette of the invention and
capable of presenting on the surface of the bacterium and/or secreting a polypeptide in a
human or animal body.

The medicament or therapeutic agent is conveniently a lyophilised powder for
reconstitution as a suspension or for production of a solid pharmaceutical form such as a
capsule or a tablet. The therapeutic agent can be administered orally or intranasally. For oral
administration capsule or tablet formulations may be used. To protect the compositions
against the acidity of the stomach, buffer substances, e.g. sodium bicarbonate, may be used
and/or the formulations may be covered with enteric coatings, e.g. Eudragit "S" or "L",
cellulose acetate, cellulose phthalate or hydroxypropyl cellulose. A convenient way for oral
administration of the therapeutic agents is to provide them as lyophilised powders, and shortly
before administration to suspend these in for example water, fruit juice or physiological
saline, optionally with addition of sodium bicarbonate or neutral citrate, or phosphate buffer to
protect against the acidity of the stomach. Any convenient dose may be used; this may be in
the range from 1 to 10^11 bacteria, and more conveniently we anticipate this to be in the range
from about 10^3 to about 10^9 bacteria.

A principal use of the invention is in vaccination. Therefore in a further aspect we
provide a vaccine which comprises a Clostridial bacterium transformed by a gene cassette of
the invention and capable of secreting and/or presenting an antigen on the surface of the
bacterium in a human or animal body.

In addition to improving the existing anti-infection vaccines and creating new ones, 
there is also current interest in several novel uses for vaccines. 

Allergy. One strategy is engineered anti-allergy vaccines containing the
immunostimulatory part of each antigen but lacking the part which interacts with IgE and
thus normally elicits the allergic reaction. Another new approach is to induce an immune
response towards human IgE, which normally governs the allergic response, by turning these
molecules into "non-self" ones, e.g. by coupling to IgE of animal origin. The use of these
hybrid IgE molecules as vaccine is expected to elicit production of anti-IgE antibodies that
inactivate human IgE, thereby preventing allergy.

Alternatively, allergy may be prevented by stimulating the immune apparatus of the
newborn in such a way that cellular, IgG and IgA antibody responses to microbial antigens,
i.e. anti-infection, will be preferred to IgE production against allergens (immune balancing).

Auto-immune diseases. Another newly proposed area for vaccines is to boost tolerance
to "self" antigens in utero and/or in the newborn in order to prevent later development of
auto-immune disorders such as type 1 diabetes, rheumatoid arthritis, inflammatory bowel
disease and multiple sclerosis. This may be achieved either by non-specific tilting of the
newborn immune system towards anti-infection and away from auto-immunity and allergy as
mentioned above, or by applying the "self" molecule (e.g. human insulin or other beta cell
antigens, connective or CNS tissue antigens) coupled to or together with a "danger molecule"
of microbial origin (e.g. part of the tetanus or cholera toxin, see above), here in order to
amplify the normal immunotolerance response to e.g. insulin and thus the natural avoidance
of juvenile diabetes.

Pregnancy and metabolic diseases. In contrast to the newborn, exposure to a "self"
antigen, especially when coupled to a "danger molecule", may in the adult individual lead to an
immune response to the antigen rather than reinforced tolerance. Such vaccines boosting
specific auto-immunity may be used for prophylaxis and therapy by eliciting antibodies
directed against specific "self" target molecules, such as sperm or egg components or human



WO 01/94599 



PCT/SE01/01280 




chorionic gonadotropin (hCG) to prevent fertility, enzymes in cholesterol biosynthesis to prevent
arteriosclerosis, beta amyloid for prevention and cure of Alzheimer's disease, and other brain
proteins to counteract prion and Creutzfeldt-Jakob disease.

Drug addiction. A further novel application of vaccines includes the use of drug
molecules such as nicotine or heroin as part of the antigen for induction of anti-drug
antibodies that block the drug's activity, remove the drug and thereby abolish its CNS effect,
in order to cure addiction.

Cancer. Novel multicomponent vaccines containing "danger molecules" may be of use
also against cancer by boosting the innate immune defense, by eliciting anti-tumour
antibodies and cellular immune responses, or by stimulating apoptosis of cancer cells.

Before the optimal anti-infection and other types of mucosal vaccines can be achieved,
a lot more has to be learned, e.g. about the interaction of C. difficile with M-cells and mucosal
cells in the gut, the uptake, processing and presentation of each antigen, and how to optimize the
choice and presentation of adjuvant so that the immune response can be maximized and
modulated in the desired way and thus perfected for each application in order to obtain a
desired cellular, IgA, IgG and subclass or combined response. Thus, the choice, size and form
(soluble, particulate) of antigen, the vector (live?), the choice and form of carrier molecule
(adjuvant, separate, fused to the antigen), recipient (mother, child, both, adult), timing of
administration etc. need to be tailored in each case.

In the above approaches to vaccination against infection in mammals and for immune
modulation in newborns and adults, recombinant Clostridia producing desired polypeptides
may be of particular interest as live vectors for mucosal immunization since certain species,
e.g. C. perfringens and C. difficile, belong to the normal gut flora. C. difficile is particularly
common in the neonatal period, and toxin-negative strains thus can be given orally at all ages
without ethical concerns. C. difficile appears to be a particularly good candidate also for
delivery of antigens for gut mucosal immunization, as exposure to microscopic numbers of the
organism during hospital stay resulting in asymptomatic carriage is enough to yield an
immune response to its toxins (NEJM 2000). Furthermore, we have observed in animals that
asymptomatic gut colonization by C. difficile results in an immune response also to its S-layer
protein (see below). These responses are probably enhanced by the non-toxic part of
the C. difficile toxins that is used for their receptor-mediated pinocytosis into the mucosal
cells. Toxin B then can form membrane pores in the pinocytic vacuoles containing toxin and
presumably also in phagocytic vacuoles containing whole bacteria (Barth H et al, Low pH
induced formation of ion channels by C. difficile toxin B in target cells. J Biol Chem 276(14):
10670-10676, 2001). Thereby the toxin and other bacterial components may be released
into the cytosol of the mucosal cells and may spread also to neighbouring cells, including
antigen-presenting cells, thus enhancing an immune response. Such an unusual adjuvant
effect of C. difficile toxin B, obtained by breakage of phagocytic vacuoles and intercellular
spread of internalized antigens and bacteria, can alternatively also be obtained e.g. by
including the membrane-attacking peptide listeriolysin O from Listeria monocytogenes in a
recombinant C. difficile strain in order to boost immunity, as has been shown in experiments
using other gut mucosal delivery systems (Dietrich G et al, From evil to good: a cytolysin in
vaccine development, Trends in Microbiology 9:23-28, 2001).

Clostridia furthermore represent a unique torpedo able to deliver a desired
heterologous polypeptide to hypoxic tissues such as tumours. This is because spores of these
obligate anaerobic organisms given intravenously are known to settle and be able to germinate
into growing clostridial cells in the hypoxic parts of tumours but not in healthy tissues. This
phenomenon was described already in 1955 (reference 7 in Theys J et al, FEMS Immunol
Med Microbiol 30:37-41, 2001) and is currently being studied for therapeutic applications by
several groups (see also Abstracts, The 3rd Int Meeting on Molecular Genetics and
Pathogenesis of the Clostridia, Chiba, Japan, June 8-11, 2000). Thus, anti-tumour peptides
including apoptosis-inducing peptides, cytokines, toxins and other proteins, such as enzymes
locally converting pro-drugs to active anti-cancer chemotherapeutic agents, thus minimizing
systemic side effects, all produced by recombinant Clostridia inside tumours, may become
novel approaches to cure cancer. We propose that another approach to anti-tumour therapy is
iv injection of spores of recombinant Clostridia producing various angiogenesis inhibitors,
mostly of peptide nature and currently used in many clinical trials (phase I: angiostatin,
SU6688, combretastatin A-4 prodrug, PTK787/ZK2284; phase II: endostatin, anti-VEGF Ab,
TNP-470, IL-12, 2-methoxyestradiol, squalamine, vitaxin, EMD 121974, COL-3, CGS-27023A,
CAI; phase III: thalidomide, marimastat, IFN-alfa, neovastat, BMS-275291,
SU5416, AG3340, IM862, as summarized in Larsson H, Regulation of angiogenesis, Thesis,
2001, Uppsala University, Sweden, ISBN 91-554-4954-9).

A problem in these studies, possible to overcome by local production of enzymes,
toxins and other peptides, is the limited effect in vivo due to short half-life in serum and thus
insufficient local concentrations of e.g. the cytostatic agent or the anti-angiogenesis peptide.
In the latter application recombinant Clostridia may become particularly powerful
(self-accelerating), because the effect of the peptide will further lower the oxygen tension and
thus enhance bacterial growth and further production of the peptide(s).

We further believe that iv administration of spores from recombinant Clostridia
may be used also against other diseases involving local tissue hypoxia, for example to deliver
fibrinolytic and other agents for venous or arterial occlusion, and oxygen-releasing or other
tissue-vitalizing molecules for tissue necrosis and gangrene.

We have now found and disclose that Clostridial spores may be used to deliver
heterologous polypeptide(s) to a human or animal body. This is an important step forward. A 
spore is a dormant or resting state of a bacterial cell. Unlike bacterial spores from species 
belonging to the obligate aerobic genus Bacillus (see above), ingested Clostridial spores 
naturally germinate into vegetative bacteria that can grow anaerobically and naturally colonise 
a human or animal gut. Intake of the spores of the genetically engineered Clostridia is 
preferably through the oral route. Spores are able to resist stomach HCl and digestive
enzymes. Upon contact with bile they will germinate and establish themselves in the colonic 
flora as vegetative bacteria presenting and/or secreting for example the desired heterologous 
peptide in vivo. 

Therefore, in a further aspect of the invention we provide a therapeutic agent which 
comprises spores of Clostridia transformed with a construct capable of expressing, secreting 
or presenting a heterologous polypeptide in the mammalian body after conversion
(germination) to live (vegetative) bacteria. 

The construct is preferably a recombinant gene cassette of the invention as outlined 
before. The mammalian body is preferably a human or animal.

The use of Clostridial spores has a number of advantages including low production 
cost, relative ease of production, very long shelf life independently of the mode of storage, 
ease of administration, production of antigen at the site of action, and an oral route of 
immunisation which may be superior to a parenteral one. As mentioned above for bacteria, 
spores are suitable for administration of mixtures (cocktails) of recombinant Clostridia having
different desired properties. 

The use of live vaccines administered via the oral route may lead to further fecal-oral 
transmission and enhanced immunization of a population. On the other hand, this may also be 
considered to be an unwanted spread of genetically modified organisms in the environment.
Spores of Clostridia survive readily in the environment whereas the vegetative forms have a 
very limited capability to survive in an oxygen-containing milieu. The invention may be
further developed to create Clostridia that are unable to reconvert to spores once they have
germinated in the colon. One way is to modify a genetic element present in C. difficile that is
similar to the so-called skin (Sigma K intervening) element of B. subtilis. This element
truncates the sigma K factor necessary for sporulation, and is removed by a specific
excision system during sporulation (Krogh, S. et al. (1996) and Takemaru, K. et al. (1995)). By
genetic modification of the excision system, and insertion of wild-type copies with inducible
promoters, it is possible to create a host strain that is able to sporulate only under
special conditions, e.g. in the presence of a special chemical (IPTG) or at low temperature
(20 °C). Such a construction would allow the production of spores in vitro, whereas no new spores
are created in the vaccinated host. The spread of genetically modified Clostridial
microorganisms to the environment would still occur, but the probability of survival of these
organisms would in practice be very low or nil.

Spores of the transformed Clostridia are produced, purified and isolated in the same
way as for native Clostridial strains. They may thus be readily obtained from a stationary
phase culture, for example by treatment with ethanol, acid or heat or by combinations of such
measures, followed by purification and isolation in a conventional way. As outgrown spores
will have the same properties as the parental bacteria, purification of the spores may not even
be necessary.

Pharmaceutical and veterinary compositions for oral administration, comprising spores
of transformed Clostridia and pharmaceutically acceptable carriers, diluents and excipients,
are further provided by the invention. They have the ability to colonise the intestinal tract of
humans and animals with live Clostridial bacteria producing and presenting or secreting the
heterologous polypeptide coded for by the modified gene cassette provided by the invention
or by any other construct. The pharmaceutical and veterinary compositions may comprise
tablets, capsules, powder for reconstitution or any other form suitable for oral administration
to humans or animals. Examples of pharmaceutically acceptable carriers and diluents are
lactose and carboxymethyl cellulose. A convenient way of oral administration of these
therapeutic agents is to provide them as lyophilized, or just dried, powders; shortly before
administration they are suspended in for example water, physiological saline or fruit juice.
The dose is as indicated above for Clostridial bacteria.

In a further aspect of the invention we provide a method of treatment for the human or
animal body, which comprises administering a therapeutic agent comprising Clostridial
spores capable of expressing a heterologous polypeptide in a human or animal body.

As in the treatment with Clostridial bacteria described above, it will be possible to treat
with mixtures of transformed Clostridial spores to obtain in the body a mixture of different
heterologous polypeptides, which may act synergistically or additively.

It will be appreciated that the therapy may be either prophylactic or therapeutic. 

The method may be applied to any convenient mammal such as a human or animal. 
Convenient animals include domestic animals such as dogs and cats, and also cattle, pigs, chickens
and horses. 

In a further aspect of the present invention we provide a method for immunisation 
which method comprises administering to a mammalian body Clostridial spores capable of 
expressing a heterologous antigen after germination. 

Examples of convenient Clostridium spores include the spores of C. difficile and C.
perfringens, which normally colonise the large intestine of man. For animals other Clostridia
such as C. tetani may also be useful. Thus, the intensity and duration of antigen exposure in 
the gut (clostridial colonization) in a particular host can be varied by not only exploiting and 
manipulating e.g. adherence of C. difficile (see above), but also by selecting the appropriate 
Clostridium species with regard to the intended host mammal. 

In a further aspect of the invention we provide Clostridial spores transformed with a 
gene expression cassette of the invention. 

It will be appreciated that the methods and materials of the invention may also be used
for other applications such as the display of antibodies and peptide libraries. They may also be 
used for screening proteins and antigens and also to provide a support for immobilising an 
enzyme, peptide and/or antigen. 

Sequence numbers of proteins encoded by the ORFs of the invention

Protein encoded by ORF    SEQ ID NO:
ORF 1                     SEQ ID NO: 22
ORF 3                     SEQ ID NO: 23
ORF 5                     SEQ ID NO: 24
ORF 6                     SEQ ID NO: 25
ORF 7                     SEQ ID NO: 26
ORF 9                     SEQ ID NO: 27
ORF 11                    SEQ ID NO: 28
ORF D                     SEQ ID NO: 29
ORF E                     SEQ ID NO: 30
ORF G                     SEQ ID NO: 31
ORF H                     SEQ ID NO: 32
ORF I                     SEQ ID NO: 33


The present invention is particularly directed to a gene expression cassette comprising
a secretory leader sequence encoding a signal peptide from Clostridium difficile having an
amino acid sequence selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2,
SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7 and signal
peptides of analogous exported clostridial N-acetylmuramoyl-L-alanine amidase-like proteins,
linked to a DNA sequence encoding a heterologous polypeptide. The signal peptides of the
analogous clostridial N-acetylmuramoyl-L-alanine amidase-like proteins may also be selected
from Clostridium difficile signal peptides having an amino acid sequence of any one of SEQ
ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, and SEQ ID NO: 12.

The gene expression cassette may further include a promoter of prokaryotic origin, e.g. 
selected from clostridial promoters comprising a nucleotide sequence selected from the group 
consisting of SEQ ID NO: 13-21, or from the promoters of ORFs 1-11 or D-I mentioned
above. 

The gene expression cassette according to the invention may further include a DNA
sequence encoding at least a cell wall binding portion of a protein of prokaryotic origin
functioning in Clostridia such that a fusion polypeptide may be presented on the outer surface
of a host cell harbouring the cassette.

The gene expression cassette according to the invention may in particular include a
DNA sequence encoding at least a functional cell wall binding portion of an S-layer protein of
C. difficile selected from any one of the polypeptides having an amino acid sequence selected
from SEQ ID NO: 22-33 such that a fusion polypeptide may also be presented on the outer
surface of a host cell harbouring the cassette. The DNA encoding the cell wall binding
portions of SEQ ID NO: 22-33 may be omitted such that the fusion peptide is secreted into the
surrounding milieu by the host cell harbouring the cassette.

Further, the gene expression cassette according to the invention may be such that the
DNA sequence encoding the heterologous peptide is inserted at a point downstream of the first
(signal) proteolytic cleavage site in the gene encoding a polypeptide having an amino acid
sequence selected from SEQ ID NO: 22-33, optionally including or excluding its second
cleavage site.
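
The choice between surface presentation and secretion described above can be summarised in a small sketch. This is only an illustration of the stated rule (cell wall binding portion retained: presented; omitted: secreted); the class, field and function names are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass


@dataclass
class Cassette:
    """Hypothetical, simplified description of a gene expression cassette."""
    signal_peptide: str            # e.g. "SEQ ID NO: 1" (illustrative label only)
    heterologous_insert: str       # name of the foreign polypeptide
    has_cell_wall_binding: bool    # True if the S-layer cell wall binding portion is retained


def expected_localisation(cassette: Cassette) -> str:
    """Return the localisation the text predicts for the fusion polypeptide.

    Assumption: a functional signal peptide is always present, as in the
    cassettes described in the text.
    """
    if cassette.has_cell_wall_binding:
        return "presented on the cell surface"
    return "secreted into the surrounding milieu"


presented = Cassette("SEQ ID NO: 1", "antigen X", has_cell_wall_binding=True)
secreted = Cassette("SEQ ID NO: 1", "enzyme Y", has_cell_wall_binding=False)
print(expected_localisation(presented))
print(expected_localisation(secreted))
```

The sketch deliberately ignores the optional second cleavage site, whose effect on localisation is not spelled out in this passage.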

In addition, the gene expression cassette according to the invention may further 
comprise at least a functional part of a secretory (secA) gene recognizing the signal peptide, 
to allow translocation of a heterologous polypeptide and/or fusion polypeptide across the 
cytoplasmic membrane of a host cell harbouring the expression cassette. For example, the 
secretory gene may be from C. difficile and encode a polypeptide having the amino acid
sequence SEQ ID NO: 34. 

In a preferred embodiment of the invention the gene expression cassette is the one that 
is shown in Figure 3. 

The invention is also directed to a vector, such as a plasmid, comprising a gene expression
cassette according to the invention.

The invention is further directed to a host organism transformed with a vector according
to the invention for expression of the heterologous polypeptide and/or fusion polypeptide. 

In an embodiment of the invention the host organism is a Clostridium host organism 
transformed with a vector according to the invention for expression of the heterologous 
polypeptide and/or fusion polypeptide.

In a preferred embodiment, the host organism is C. difficile or C. perfringens.

Further, the invention is directed to a pharmaceutical or veterinary composition or
formulation which comprises Clostridial cells transformed with a vector according to the
invention, with the ability to present on the cell surface and/or to secrete an expressed
heterologous polypeptide or fusion polypeptide, together with a pharmaceutically or
veterinarily acceptable carrier or diluent. Preferably, the composition or formulation is suitable
for oral or intranasal administration. The composition or formulation according to the
invention may further comprise, as adjuvants, non-toxic motifs of the C. difficile toxins A
and/or B that enable the heterologous polypeptide and/or fusion polypeptide to enter the
colonic mucosal cells of a mammal by receptor-mediated endocytosis, and/or a portion of
toxin B responsible for its intracellular and intercellular spread. The composition or
formulation according to the invention may alternatively or additionally comprise a further fused
or separate carrier peptide or adjuvant, in addition to the expressed heterologous polypeptide
and/or fusion polypeptide, to elicit a stronger or differently directed immune response than
that against the expressed heterologous polypeptide acting alone.

The invention is, in another aspect, directed to a vaccine which comprises a Clostridial
bacterium transformed with a vector according to the invention and capable of presenting on
the surface of the bacterium and/or secreting an antigen in a human or animal body, and
optionally an adjuvant described in conjunction with a composition or formulation of the
invention. The vaccine may comprise a mixture of at least two differently engineered
Clostridia strains, each capable of presenting on the surface of the bacteria and/or secreting a
different heterologous polypeptide and/or fusion polypeptide. Further, the vaccine may
comprise spores of Clostridia cells or bacteria transformed with a vector according to the
invention and capable of germinating into cells which are able to grow, express, and present
or secrete a heterologous polypeptide and/or fusion polypeptide, and optionally also an
adjuvant described in conjunction with a composition or formulation of the invention, in a
mammalian body. The vaccine may comprise a mixture of spores from at least two differently
engineered Clostridia strains. Each of these strains is capable of presenting on the surface of
the bacterium and/or secreting a different heterologous polypeptide and/or fusion polypeptide.
The spores are preferably from C. difficile or C. perfringens.

The invention is in yet another aspect directed to a medicament which comprises a
Clostridial bacterium transformed with a vector according to the invention and capable of
presenting on the surface of the bacterium and/or secreting a heterologous polypeptide and/or
fusion polypeptide in a human or animal body, and optionally an adjuvant described in
conjunction with a composition or formulation of the invention. The medicament may
comprise a mixture of at least two differently engineered Clostridia strains, each capable of
presenting on the surface of the bacteria and/or secreting a different heterologous polypeptide
and/or fusion polypeptide. Further, the medicament may comprise spores of Clostridia cells or
bacteria transformed with a vector according to the invention and capable of germinating into
cells which are able to grow, express, and present or secrete a heterologous polypeptide and/or
fusion polypeptide, and optionally an adjuvant described in conjunction with a composition
or formulation of the invention, in a mammalian body. The medicament may comprise a
mixture of spores from at least two differently engineered Clostridia strains. Each of these
strains is capable of presenting on the surface of the bacterium and/or secreting a different
heterologous polypeptide and/or fusion polypeptide. The spores are preferably from C.
difficile or C. perfringens.

The invention is in still another aspect directed to a method for vaccination of a
mammal, which comprises administering a therapeutically or prophylactically effective dose
of a vaccine according to the invention to the mammal. Spores used in the vaccine are
preferably from C. difficile or C. perfringens.

The invention is also directed to a method for prophylactic or therapeutic treatment of
a mammal, which comprises administering a therapeutically or prophylactically effective dose
of a medicament according to the invention to the mammal. Spores used in the medicament
are preferably from C. difficile or C. perfringens.

The invention is additionally directed to a C. difficile-associated diarrhea (CDAD)
vaccine comprising spores according to the invention and capable of expressing, after
germination,

(i) relevant parts of the C. difficile toxins, alone or together with a
(ii) suitable adjuvant to provide an IgA response to the toxin antigenic epitopes and
(iii) S-layer protein antigenic variants (serotype antigens) or fimbrial antigens to obtain,
after administration to a mammal, a polyvalent anti-S-layer (or anti-fimbrial) IgA
response preventing C. difficile colonization of the mammal.

The invention will now be illustrated but not limited by reference to the following
Figures, Specific Descriptions, Tables, and Examples wherein:

Figure 1A shows our first simple layout of the C. difficile strain 630 genomic segment
encoding the S-layer genes. ORF2 represents secA and ORFs 1, 3, 7-9 and 11 S-layer protein
genes. For explanations of these and other ORFs, see Table 1 and Example 9.

Figure 1B represents the result of comparisons between three of the S-layer ORFs with
published sequences of other genes. The "amidase enhanced precursor" sequence is equivalent
to the N-acetyl muramoyl L-alanine amidase motif mentioned in the text.

Figure 2. Defining the upstream region of ORF 1-12. The figure illustrates additional
information and genetic organisation of the C. difficile S-layer genes (cf. Figure 1), found
after searches in the revised C. difficile database at the Sanger Centre. The genes upstream of
ORF 1 to 12 are denoted A to I (see also Table 1). The numbers +1, +2 and +3 indicate the
reading frame of the ORFs relative to the start point of the contig. ORFs D, E, G, H and I had
the amidase motif typical of genes encoding the C. difficile S-layer proteins.

Figure 3 shows an example of a preferred gene expression cassette, here taken from C.
difficile strain 630 and containing a strong promoter, the secretory leader peptide from ORF1,
the signal peptide cleavage site, the area of insertion of foreign DNA encoding the
heterologous peptide, the second (optional) peptide cleavage site in the N-acetyl muramoyl
L-alanine amidase motif, and the secA gene (ORF2).

Figure 4 shows a preferred strategy for introducing a recombinant gene cassette of the
invention back into C. difficile via B. subtilis.

Figure 5. Further details of a particular C. difficile S-layer gene cassette. This is a 4960 bp
cassette taken from strain 630 encoding an S-layer protein of 2160 bp in its original form
(ORF1). The 210 bp region (pr, promoter) upstream of ORF1 includes gene control elements
for the S-layer protein included in the cassette. Also shown are an intervening 244 bp region
and the 2346 bp secA sequence.
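
The segment lengths quoted in the legend to Figure 5 can be checked against the stated cassette size with a short calculation. The following is only an illustrative sketch: the dictionary and function names are hypothetical, and only the four lengths are taken from the legend.

```python
# Segment lengths (bp) as quoted in the legend to Figure 5.
# The names below are illustrative labels, not identifiers from the source.
CASSETTE_SEGMENTS = {
    "promoter region (pr)": 210,
    "ORF1": 2160,
    "intervening region": 244,
    "secA (ORF2)": 2346,
}


def cassette_length(segments):
    """Return the total length in bp of the listed segments."""
    return sum(segments.values())


if __name__ == "__main__":
    # Sums to the 4960 bp cassette size stated in the legend.
    print(cassette_length(CASSETTE_SEGMENTS))
```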

Figure 6. Strategy for the engineering of ORF1 (Figure 5) to express a recombinant protein
(for example as outlined in Example 2 and 2A). The 613 bp variable region (vr) is replaced
by a foreign DNA. For example, three fragments encoding the Hepatitis B virus surface
antigen (HBsAg) were selected: (i) the full length HBsAg that includes the pre-S1, pre-S2 and
the S gene (1207 bp); (ii) the S gene (740 bp); and (iii) the subtype from the S gene (minimum
antigenic epitope, 421 bp).

Figure 7. Cloning strategy for the construction of ORF1-secA (with and without the native
promoter) together with the different lengths of the HBsAg antigenic loop (full length, S gene
and subtype; see legend to Fig. 6 above) using a PCR based method and cloning into the TA
vector in E. coli. The primers indicated were also used in PCRs to help check the
correctness of the constructs. The expected and obtained constructs were 5564, 5097 and 4778
bp respectively.

Specific descriptions 

A. Characterisation of the genomic segment responsible for S-layer protein expression in 
C. difficile (outlined in Fig. 1 and Fig. 2, see also Table 1 and Example 9): 

1. The main surface layer proteins expressed by C. difficile strain 630 have been found to be
encoded by a single open reading frame (ORF1) encoding a 72 kDa protein. The gene
product of ORF1 is post-translationally cleaved at two sites yielding three different
peptides: the leader peptide and the final S-layer proteins of apparent molecular weights of
36 kDa and 45 kDa.



WO 01/94599 PCT/SE01/01280 


2. The C-terminus of ORF1 shows similarity to N-acetyl muramoyl L-alanine amidase, and
the N-terminus shows weak similarity to surface layer proteins from L. helveticus (Fig.
1B).

3. The gene immediately downstream of ORF1 (ORF2) encodes the SecA protein
responsible for secretion of proteins with signal peptides.

4. The genes downstream of secA encode proteins with similarities to ORF1 (ORF3, 5-7, 9
and 11, see Fig. 1).

5. ORF1 is efficiently expressed and its product is efficiently exported in strain 630, whereas
e.g. ORF3 is expressed more than 100-fold less in strain VPI 10463, indicating a strong
termination between ORF1 and 3 (as judged by identification of exported proteins by
two-dimensional gel electrophoresis).

6. The upstream region including the putative promoter has not been characterized
functionally, but the very high expression of ORF1 in various growth conditions indicates
the action of a strong, constitutive promoter. It is also active in E. coli (see Example 2).

7. In the revised C. difficile sequence database from the Sanger Centre, the upstream region of
ORF1 was included and revealed 9 new ORFs (A-I), of which 5 (D, E, G, H and I) had the
N-acetyl muramoyl L-alanine amidase motif typical of the C. difficile S-layer protein
ORFs (Fig. 2 and Table 1). The putative promoter region for ORF1 is thus situated
between ORF I and ORF1 (see Example 1A and Table 2).

The S-layer proteins from strain VPI 10463 have similar molecular weights but different pI as
compared to those of strain 630, and the N-terminal sequences of the two S-layer proteins
from VPI 10463 showed no similarity with those of strain 630. Studies of strains from
different serogroups showed that the S-layer proteins vary in pI and molecular weight. The
downstream region of the gene segment may in part be more conserved, since the N-terminal
sequence from another extracellular protein from strain VPI 10463 was identical to ORF3 of
strain 630. Our results indicate that ORF1 is located at a part of the chromosome that is capable
of expressing and exporting various S-layer proteins depending on the strain.

B. Designing a preferred engineered gene cassette based on the C. difficile S-layer gene
segment data (outlined in Fig. 3):

1. Any strong prokaryotic promoter functional in Clostridia can be used to express the
heterologous peptide, e.g. the promoter of ORF1 or any of the promoters of genes encoding
other highly expressed proteins in C. difficile such as certain electron transfer proteins (our
unpublished data and FixA and FixB in Example 9) or ribosomal proteins.
2. A secretory leader peptide, preferably the leader peptide from ORF1, is fused with the
heterologous peptide to ensure its translocation across the cell membrane.
3. Depending on the desired fate of e.g. the antigen (secreted, surface presented or both), the
heterologous peptide is optionally fused to the amidase part of the S-layer protein, optionally
including the part involved in the proteolytic cleavage event (Figure 3). Thus, for maximum
release of the peptide the secretory leader of e.g. ORF1 may be sufficient. On the other hand,
maximum cell-wall binding may require fusion to the amidase portion but omitting the
proteolytic cleavage sequence in the middle of the gene (Figure 3). If both free and bound
heterologous peptide are desired, one recombinant cassette of each type present in the same
Clostridium strain, or a mix of two different strains each harbouring one of the recombinant
cassettes, can be used. The peptide cleavage site may be exploited if, for instance, the antigen
and an adjuvant are produced in a fused form, to obtain equal amounts of the two, but are
desired as separate peptides on the outside of the producer bacterium. To what extent parts of
the N-terminal (variable) portion of e.g. ORF1 can be used to optimize the localization of the
heterologous peptide requires further experimentation.

4. The secA gene is usually included in the construct to ensure efficient translocation of the
polypeptide across the cell membrane.
5. The gene construct is made in plasmids suitable for transformation of both E. coli and C.
perfringens (e.g. pJIR750 or 751) or in plasmids suitable for conjugation into C. difficile via
B. subtilis (e.g. pCI195 or pSMB47, Figure 4).
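
The design choices in points 1 to 5 above can be summarised as a small decision sketch. This is a hedged illustration only: the class, field and function names are hypothetical, the default promoter and leader are simply the preferred ORF1 elements named in the text, and the mapping reflects our reading of point 3 (leader alone for release, amidase fusion without the internal cleavage sequence for cell-wall binding, one cassette of each type for both).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Cassette:
    """Hypothetical description of one engineered cassette (names are illustrative)."""
    promoter: str
    leader_peptide: str
    amidase_fusion: bool          # fuse the heterologous peptide to the amidase part?
    internal_cleavage_site: bool  # keep the proteolytic cleavage sequence?
    include_secA: bool = True     # point 4: secA is usually included


def design_cassettes(fate: str) -> List[Cassette]:
    """Map the desired fate of the heterologous peptide to cassette layouts."""
    released = Cassette("ORF1 promoter", "ORF1 leader",
                        amidase_fusion=False, internal_cleavage_site=False)
    bound = Cassette("ORF1 promoter", "ORF1 leader",
                     amidase_fusion=True, internal_cleavage_site=False)
    if fate == "secreted":
        return [released]
    if fate == "surface":
        return [bound]
    if fate == "both":
        # One cassette of each type, in the same strain or in two mixed strains.
        return [released, bound]
    raise ValueError(f"unknown fate: {fate!r}")


if __name__ == "__main__":
    for cassette in design_cassettes("both"):
        print(cassette)
```

The fused antigen-adjuvant case, where the internal cleavage site is deliberately retained, is not modelled here.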

Examples 


Example 1 

Defining a minimal gene segment responsible for production and export of the heterologous
protein (an engineered gene cassette, see also Figure 3).
A. Defining the promoter start (the 5' end).
The promoter region may be further characterised in different C. difficile strains, for example
by the following steps:

1. Defining the DNA upstream of ORF1 by circular PCR. One or several restriction enzymes
are used to cut the DNA outside ORF1. Useful enzymes are NdeI, TaqI, PstI, and BsmAI.
A Southern blot is performed using a probe against ORF1 to confirm the size of the fragment
(optimal size is around 1-3 kb). The cleaved genomic DNA is then ligated into circles.
PCR is then performed using primers (directed outwards from each other) directed against
ORF1. The PCR product is then cloned and sequenced. Optionally, PCR is performed
with one primer directed against ORF1 (lower primer) and different arbitrarily designed
primers. The products are then cloned and sequenced.

2. Identification of the transcription start. Primer extension with a primer located at the 5
prime end of the genomic segment is used.

3. By computerised search for homologies to known promoters. The new sequence data for
strain 630 enabled a search for putative promoters upstream of ORF1 and the result is
shown in Table 2. Due to the AT rich genome, several putative promoters were found. The
actual promoter start has to be experimentally determined as described in point 2 above.

B. Defining the termination point (the 3' end).

Primers directed at different parts of ORF1, 2 and 3 are used to determine transcript abundance using
Northern blots. Termination loops in the RNA may be identified by computer analysis.

C. Conservation of the S-layer locus in other C. difficile strains: characterization of part
of the S-layer genomic segment (ORF1 and proximal parts of its upstream region and of
part of ORF2) in different C. difficile serogroups in order to check for conservation of
this region of the genome.

Primer pairs that are directed against the identified upstream-transcription start region and the 
proximal part of ORF2 were designed. PCR was performed on different strains belonging to 
all serogroups to confirm the generality of the expression center of the S-layer locus. 


(i) PCR performed with chromosomal DNA as template from the different serogroups of C. 
difficile. 

The primers used: 

CONS1: 5 prime - TAT AAT GTT GGG AGG AAT TTA AGA - 3 prime, total length 24 nt
(5 prime end starts at 8th nt upstream of ORF1, ends at 32nd nt)

CONS2: 5 prime - CAA ATC CAA ATT CAC TAT TTG TAC - 3 prime, total length 24 nt
(5 prime end starts at 2983rd nt downstream of ORF1, ends at 2959th nt)
Total size of expected PCR product (from strain 630 sequence): 2975 bp (includes ORF1 and the
proximal part of ORF2).
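
Basic properties of the two primers can be checked with a few lines of code. The sketch below is illustrative only: the function names are hypothetical, and the melting temperature uses the Wallace rule of thumb (2 °C per A/T, 4 °C per G/C), which is our own choice for the illustration and is not a method stated in this description.

```python
# Primer sequences as listed above, written without the triplet spacing.
PRIMERS = {
    "CONS1": "TAT AAT GTT GGG AGG AAT TTA AGA",
    "CONS2": "CAA ATC CAA ATT CAC TAT TTG TAC",
}


def clean(primer):
    """Remove spacing and normalise to upper case."""
    return primer.replace(" ", "").upper()


def gc_count(seq):
    """Number of G and C bases in the sequence."""
    return seq.count("G") + seq.count("C")


def wallace_tm(seq):
    """Rule-of-thumb melting temperature in deg C (assumption, not from the source)."""
    gc = gc_count(seq)
    at = len(seq) - gc
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, raw in PRIMERS.items():
        seq = clean(raw)
        print(name, len(seq), "nt;", gc_count(seq), "G/C; approx. Tm", wallace_tm(seq))
```

Both primers are AT rich, in line with the AT rich genome noted in part A.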

The enzyme/system used: Expand™ Long Template PCR System from Boehringer 
Mannheim. 

Reaction conditions (as specified by the manufacturer): In a total reaction volume of 50 µl,
350 µM dNTPs, 300 nM primers, 50 ng chromosomal DNA template, 1x supplied PCR buffer
with 1.75 mM MgCl2, and 2.5 U of a mix of Taq and Pwo DNA polymerase. 10 µl of the
reaction mix was run on a 0.8% Agarose-TBE (Tris-Borate-EDTA buffer) gel to check for
product.

PCR cycle conditions: Initial denaturation - 92°C for 2 mins
For 30 cycles - denaturation - 92°C for 10 secs
annealing - 40°C for 30 secs
elongation - 68°C for 2 mins
Final elongation - 68°C for 2 mins
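
The total hold time of this programme follows directly from the step durations. The sketch below is illustrative only; the function name is hypothetical and the calculation ignores ramping between temperatures, which depends on the instrument.

```python
def programme_seconds(initial, cycles, cycle_steps, final):
    """Total hold time in seconds: initial step + cycles x (sum of steps) + final step."""
    return initial + cycles * sum(cycle_steps) + final


# Step durations in seconds, taken from the cycle conditions listed above.
EXAMPLE_1C = programme_seconds(
    initial=2 * 60,                 # initial denaturation, 2 min
    cycles=30,
    cycle_steps=(10, 30, 2 * 60),   # denaturation, annealing, elongation
    final=2 * 60,                   # final elongation, 2 min
)

if __name__ == "__main__":
    print(EXAMPLE_1C, "s =", EXAMPLE_1C / 60, "min (excluding ramp times)")
```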


PCR results: 

Serogroup tested    Product size                         No. of Expts

A                   none                                 3
B                   around 2900 bp                       2
C                   around 2900 bp                       2
D                   around 2900 bp (lesser amt)          2
F                   around 2900 bp                       2
G                   around 2900 bp                       2
H                   product > 2900 bp (lesser amt)       3
I                   around 2900 bp                       2
K                   around 2900 bp                       2
X                   around 2900 bp                       2
A2                  around 2900 bp (lesser amt)          3
A3                  around 2900 bp (lesser amt)          2
A4                  around 2900 bp (lesser amt)          2
A5                  none                                 2
A6                  around 2900 bp (lesser amt)          2
A8                  around 2900 bp                       2
A9                  around 2900 bp (lesser amt)          2
A10                 around 2900 bp                       2
S1                  around 2900 bp                       2
S3                  around 2900 bp                       2
S4                  none                                 2
VPI 10463 (G)       around 2900 bp                       4, last PCR was weak
630 (X)             around 2900 bp (expected size)       5, last PCR was -ve

Serogroups A, A5 and S4 did not give any PCR product despite several attempts.

The PCR reactions were very sensitive to the condition of the template, which had to be prepared fresh.
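
The PCR results can be tallied programmatically to confirm which serogroups failed to amplify. The sketch below is illustrative only: the product descriptions are transcribed from the results table above, and the variable and function names are hypothetical.

```python
# (serogroup or strain, product description) transcribed from the PCR results table.
PCR_RESULTS = [
    ("A", "none"), ("B", "~2900 bp"), ("C", "~2900 bp"),
    ("D", "~2900 bp, lesser amount"), ("F", "~2900 bp"), ("G", "~2900 bp"),
    ("H", ">2900 bp, lesser amount"), ("I", "~2900 bp"), ("K", "~2900 bp"),
    ("X", "~2900 bp"), ("A2", "~2900 bp, lesser amount"),
    ("A3", "~2900 bp, lesser amount"), ("A4", "~2900 bp, lesser amount"),
    ("A5", "none"), ("A6", "~2900 bp, lesser amount"), ("A8", "~2900 bp"),
    ("A9", "~2900 bp, lesser amount"), ("A10", "~2900 bp"), ("S1", "~2900 bp"),
    ("S3", "~2900 bp"), ("S4", "none"), ("VPI 10463 (G)", "~2900 bp"),
    ("630 (X)", "~2900 bp"),
]


def without_product(results):
    """Names of the entries that gave no PCR product."""
    return [name for name, product in results if product == "none"]


def with_product(results):
    """Names of the entries that gave a PCR product of any size."""
    return [name for name, product in results if product != "none"]


if __name__ == "__main__":
    print("no product:", without_product(PCR_RESULTS))
    print("product obtained:", len(with_product(PCR_RESULTS)), "of", len(PCR_RESULTS))
```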

(ii) Restriction Enzyme Analysis of the PCR products.

12 µl PCR reaction mix (reactions done above) from most of the serogroups (except A, A5-6,
A9, and S4) were further subjected to RsaI (Boehringer Mannheim) and Sau3AI (Amersham
Pharmacia Biotech) digestions. Total digestion reaction mixes were run on a 0.8% Agarose-
TBE (Tris-Borate-EDTA buffer) gel to obtain respective digestion patterns.

Each serogroup tested appears to give rise to a unique restriction pattern, though the
apparent size of the PCR product appears to be similar except for group H.
Conclusion:

It appears that the promoter-ORF1-ORF2 organization and the size of the genetic segment
between the putative promoter and ORF2 (secA gene), i.e. the ORF1 equivalent, is generally
conserved in C. difficile. There may be sequence variation between ORF1 equivalents from
strains of different serotype and also between two strains of the same serotype, presumably
reflecting the variable (non-amidase) portion of these genes.
Example 2

Cloning of the C. difficile promoter-ORF1-ORF2 "cassette" from strain 630 and construction
of a recombinant cassette in E. coli encoding heterologous proteins to be transferred to C.
perfringens and C. difficile and used for immunisation.

1. Primer pairs are designed that include the promoter region and part of ORF1 including the
leader peptide sequence (Figure 5). PCR is performed followed by cloning of the product
into E. coli-C. perfringens shuttle vectors pJIR750 or pJIR751 (Plasmid 229: 233-235,
1993) in frame with a reporter gene such as β-lactamase or at least a part of the hepatitis B
virus (HBV) antigen. A convenient source of HBV antigen is the SMI strain no. 8423/87
having the genotype A and subtype adw2 (cf. Magnius et al., J. Gen. Virology, 1993, 74,
1341-1348). The plasmid is isolated from E. coli, purified and used to transform C.
perfringens, and the engineered strain is isolated for further use. The secA gene is
optionally included in the construction to optimise secretion.

2. Check for expression and secretion of the reporter gene using an HBV antigen based assay
or a β-lactamase assay of the transformed E. coli and C. perfringens strain. Gnotobiotic
mice and rats are fed with spores of the engineered Clostridium strain to obtain
colonisation. Expression of antigen is checked by the HBV antigen based assay or
β-lactamase assay in feces, and immune response by antibody response in feces and in
serum.

A. Cloning of ORF1 - ORF2 (secA) in order to construct a fusion with a foreign antigen

PCR was performed with chromosomal DNA from strain 630 as template. The primers used:

AMP1: 5' - GGA ATT CCA TGA ATA AGA AAA ATA TAG CA - 3',
total length 29 nt (5' end starts at the first codon of ORF1, ends at 7th codon)
AMP2: 5' - CGG GAT CCC GTT TTT AGT TAA ATT TAT ATA AG - 3',
total length 32 nt (5' end starts at the stop codon for secA)
Analogous PCRs but with the first primer in the upstream region in order to include a putative
native promoter were also performed.

Total size of expected PCR product (from strain 630 sequence): 4770 bp (4960 bp including
the promoter) (Figure 5).




The enzyme/system used: Expand™ Long Template PCR System from Boehringer
Mannheim. Reaction conditions (as specified by the manufacturer): In a total reaction volume
of 50 µl, 350 µM dNTPs, 300 nM primers, 150 ng chromosomal DNA template from strain 630,
1x supplied PCR buffer with 1.75 mM MgCl2, and 2.5 U of a mix of Taq and Pwo DNA
polymerase. 10 µl of the reaction mix was run on a 0.8% Agarose-TBE (Tris-Borate-EDTA
buffer) gel to check for product.

PCR cycle conditions: Initial denaturation - 92°C for 2 mins
For 30 cycles - denaturation - 92°C for 10 secs
annealing - 40°C for 30 secs
elongation - 68°C for 4 mins
Final elongation - 68°C for 5 mins

Results: 

The expected PCR products were obtained and cloned into pGEMT vector (Promega). The
plasmid containing the insert will be subjected to partial digestion with PvuII enzyme (sites at
positions 282 and 895 of the insert) to eliminate the 613 bp internal fragment from ORF1,
where the foreign antigen is planned to be inserted (Figure 6). The digestion time had to be
standardised. The foreign antigen used was the hepatitis B virus (HBV) surface antigen
(HBsAg).

Three HBsAg regions were used (Figure 6):
(a) Total (413 aa, 3-1243 bp; 1240 bp)
(b) Central (259 aa, 465-1243 bp; 778 bp)
(c) C-terminal (140 aa, 822-1243 bp; 421 bp)
References for HBV antigen sequences include Prange et al., 1995, 14 (2), 247-256 and Chen
et al., 1996, 93, 1997-2001.
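
The fragment sizes quoted in this part follow from the coordinates given, as the following illustrative sketch shows. The function and variable names are hypothetical; lengths are taken as end minus start, which is the convention that reproduces the figures quoted in the text, and the residue count is the whole number of codons in each fragment.

```python
def span_length(start, end):
    """Fragment length in bp, taken as end coordinate minus start coordinate."""
    return end - start


# PvuII sites in the cloned insert (positions quoted in the Results above).
PVUII_FRAGMENT = span_length(282, 895)

# HBsAg regions (start, end) in bp, as listed in (a) to (c) above.
HBSAG_REGIONS = {
    "Total": (3, 1243),
    "Central": (465, 1243),
    "C-terminal": (822, 1243),
}


def region_summary(regions):
    """Return {name: (length in bp, whole codons)} for each region."""
    summary = {}
    for name, (start, end) in regions.items():
        bp = span_length(start, end)
        summary[name] = (bp, bp // 3)
    return summary


if __name__ == "__main__":
    print("PvuII internal fragment:", PVUII_FRAGMENT, "bp")
    for name, (bp, aa) in region_summary(HBSAG_REGIONS).items():
        print(name, bp, "bp,", aa, "aa")
```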

Alternative antigens that may be used include relevant epitopes of the rotavirus and
hepatitis A virus.

B. Cloning of the C. difficile ORF1-secA cassette containing the three HBsAg DNAs in E.
coli.

Result: 

The cloning strategy (Figure 7) using PCR and the TA vector was successful according to
DNA analyses including agarose gel electrophoresis and PCR. 
C. ELISA to check for expression in E. coli of the three ORF1-HBsAg-secA DNA
constructs in the TA vector.

A commercially available ELISA (Abbott) was performed on sonicated samples of
three overnight cultures of E. coli each containing one of the three different HBsAg DNAs
inserted into ORF1-secA (with the native C. difficile ORF1 promoter) and cloned into the TA
vector.
Results: 

Cut off value = 1.0, higher values are regarded as positive. 



Sample         Culture   Sonicate-supernatant      Pellet following sonication   Spent
(plasmid)      OD*       (periplasm + cytoplasm)   (cell membrane)               medium

1.3            1.1#      0.92                      Test Failed                   0.61
1.3 a          1.9       4.76                      3.70                          0.69
2.8            1.6#      0.99                      1.33                          0.74
2.8 a          1.8       1.14                      4.33                          0.69
3.7            1.4#      18.56                     34.58                         0.70
3.7 a          2.0       5.10                      5.11                          0.69
TA vector      2.0#      0.70                      1.13                          0.69
TA vector a    2.0       0.74                      0.91                          0.70
PBS (buffer)   NT        0.77                      NT                            NT



1.3 : Full length HBsAg (pre-S1, pre-S2 and S gene),
2.8 : S gene,
3.7 : Partial S gene (the minimum HBsAg epitope)
a : Duplicates
# : Samples stored overnight in cold before analysis.
NT : Not tested
* : Culture OD at the time of harvest. Cells from 1 ml culture were pelleted and resuspended
in 0.5 ml of PBS before sonication.
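
The cut-off rule stated for this ELISA (values above 1.0 regarded as positive) can be applied to the readings mechanically. The sketch below is illustrative only: the readings are transcribed from the table above, `None` stands for a failed or untested sample, and the names are hypothetical.

```python
CUTOFF = 1.0  # values higher than this are regarded as positive

# sample: (sonicate supernatant, pellet, spent medium); None = failed or not tested
ELISA_READINGS = {
    "1.3": (0.92, None, 0.61),
    "1.3 (duplicate)": (4.76, 3.70, 0.69),
    "2.8": (0.99, 1.33, 0.74),
    "2.8 (duplicate)": (1.14, 4.33, 0.69),
    "3.7": (18.56, 34.58, 0.70),
    "3.7 (duplicate)": (5.10, 5.11, 0.69),
    "TA vector": (0.70, 1.13, 0.69),
    "TA vector (duplicate)": (0.74, 0.91, 0.70),
    "PBS (buffer)": (0.77, None, None),
}


def classify(value, cutoff=CUTOFF):
    """Classify one reading against the cut-off."""
    if value is None:
        return "no result"
    return "positive" if value > cutoff else "negative"


if __name__ == "__main__":
    for sample, readings in ELISA_READINGS.items():
        print(sample, [classify(v) for v in readings])
```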

The above experiment was repeated with cells from 5 ml cultures resuspended in 1 ml PBS to
see if increased protein concentrations would give higher titer values. Here a control E. coli
culture with the TA vector carrying the ORF1-secA cassette but without any HBsAg insert
was also included.



Sample                 Culture   Sonicate-supernatant   Pellet following sonication
                       OD        (cytoplasm)            (cell membrane)

1.3                    2.4       8.17                   3.08 (1/10 dil)
2.8                    1.8       5.07                   2.02 (1/10 dil)
3.7                    2.4       42.0                   1.15 (1/100 dil)
orf1secA (no           2.4       Test Failed            0.92
HBsAg insert)



Conclusion:

All three constructs containing HBV DNA expressed both soluble and particulate HBsAg.
Thus, the cloning proved to be successful and the native C. difficile ORF1 promoter was to
some extent functional also in E. coli.

Example 3

Cloning of the ORF1-ORF2 (secA) "cassette" into C. difficile and construction of recombinant
protein useful for immunisation studies.

1. Cloning of a gene expression cassette according to Examples 1 and 2 in plasmids pCI195
or pSMB47 followed by transfer to a non-toxigenic strain of C. difficile by mating via B.
subtilis is performed according to standard methods reported (J. Gen. Microbiol. 136:
1343-1349, 1990; Plasmid 31: 320-323, 1994, see also Figure 4). The engineered strain is
isolated for further use.

2. Check for expression, secretion and antibody response in vitro and in vivo (see Example 
2). 


A. Cloning of the ORF1-HBsAg-secA constructs into the shuttle plasmid vector pJIR750

Results: 

Ligation mixtures containing the desired recombinant plasmids were obtained as judged by
agarose gel electrophoresis and PCR. However, upon transformation into E. coli the plasmid
constructs were fragmented. This indicated that a plasmid replication machinery better at
handling large plasmids in E. coli than that of pJIR750 (colE based) needs to be used. Also,
attempts to transform C. perfringens with our recombinant pJIR750 plasmids are being
performed.

Example 4 

Production of transformed C. perfringens expressing and presenting foreign antigen for
vaccination.

A. Live bacteria.

C. perfringens, transformed with a gene cassette coding for a foreign antigen fused to ORF1
and obtained as in Example 2, is cultivated under anaerobic conditions in a fermenter until a
cell density of at least 10^7 bacteria per ml is obtained. The broth is cooled to 11°C, the bacteria
recovered by centrifugation and the supernatant discarded. The pellet is twice washed with
cold 0.1 M phosphate buffer, pH 7, and centrifuged. The final pellet is resuspended in the
phosphate buffer to a concentration of about 10^9 organisms per ml. One ml portions of the
suspension are dispensed into glass ampoules and freeze-dried to remove the water. The final
product is obtained by sealing of the ampoules in vacuo.

B. Bacterial envelopes. 

Transformed C. perfringens bacteria are produced as in A above. The final pellet is suspended
in 50 mM Tris-HCl, pH 7.2, and sonicated for 1-10 min (40 watt, Branson Sonic Power Co.
Sonicator). Triton X-100 is added to a final concentration of 2% and the mixture incubated
under stirring at 11°C for 30 min. The cells are collected by centrifugation and washed three
times with cold distilled water. The pellet is resuspended in 5 mM MgCl2 containing DNase
(1 lmg/ml) and RNase (llmg/ml) and incubated for 15 min at 11°C. The resulting envelopes
are recovered by centrifugation, washed three times with cold distilled water, resuspended in
cold distilled water and freeze-dried to give the envelopes as a powder suitable for
formulation in capsules or tablets, or for suspension in e.g. physiological saline for oral
administration.

Example 5

Production of Clostridial spores, germination of spores in vitro and in vivo, and colonization
and immune response to C. difficile in animals given spores orally.
A Clostridial strain producing the heterologous peptide is allowed to grow anaerobically
in Peptone-Yeast extract-glucose or another medium optimal for sporulation for 48-72 h to
ensure maximum conversion of the vegetative bacteria into spores during the stationary phase.
The remaining vegetative bacteria are killed by heat or ethanol treatment, eliminated by the
bacteriolytic enzymes lysozyme or lysostaphin, and the remaining spores are purified by
centrifugation.
Results: 

Spores from C. difficile strain 630 were readily observed on 3-day old blood agar
plates as well as in PYG medium, whereas spores from C. perfringens NCTC 8/98 were
found only during growth in Duncan Strong (DS) medium, after preincubation in Fluid
Thioglycolate (FTG) medium.

Prepared spores (washed in ethanol and resuspended in sterile PBS) were checked for
germination ability on plates (TCCFA) as well as in gnotobiotic rats. Feces from 1-week-old
rats were positive for bacterial growth 1-2 days after the rats received C. difficile or C.
perfringens spores orally, confirming the ability of the spores to germinate also in vivo and
lead to colonization of the animal gut.

Antisera from 5 rats colonised for one week by C. difficile were pooled and used for
Western blotting of C. difficile protein extracts. Western blotting revealed immunological
reactions to bands corresponding to the C. difficile S-layer proteins, confirming that antibodies
were produced against these C. difficile antigens upon feeding with spores, spore germination
and colonization of the animals.

Example 6 

Production of transformed C. difficile spores for use in oral immunisation.
A. Capsules.

The C. difficile spores obtained according to Example 5 are mixed together with Mg
stearate (1%) and lactose (30%), granulated in ethanol and compressed to tablets, containing
10^6 spores, which are filled into gelatine capsules.
B. Powder for suspension.
The C. difficile spores obtained according to Example 5 are mixed together with Mg
stearate (1%) and carboxymethyl cellulose (25%), and granulated in ethanol. The granulate is
dried and dispensed into vials to give an amount of about 10^6 spores in each vial. For oral
administration the content of the vial is suspended in water or for example orange juice
immediately before intake.

Example 7 

Use of S-layer genes for epidemiological typing.

Present methods to detect and follow the spread of certain C. difficile strains in the 
environment and between infected patients (i.e. "fingerprinting") include e.g. serotyping and
PCR ribotyping. PCR ribotyping is a PCR based approach to amplify the region between the 
16S and 23S genes of C. difficile, and which has been shown to resolve and detect over 100 
different patterns or strains. Different serotypes are likely to represent differences of the 
surface-exposed proteins, i.e. variations of S-layer proteins among strains. Our results with 
PCR amplification and following cleavage with restriction enzymes indicate that this region is 
present in almost all serogroups and that the cleavage pattern also varies among these groups 
(see Example 1C, (ii)). Thus, a molecular method including PCR combined with restriction 
enzyme cleavage or direct sequencing of the variable part of the ORF1 or another part of this 
segment may be a method which is faster and more reliable than serotyping and in particular 
also more reliable than PCR ribotyping for fingerprinting. 

Example 8 

Vaccination against CDAD

Immunity to CDAD after an episode of the infection is regarded to be short (months).
This may be because anti-toxin antibodies are mainly of the serum IgG classes and not the
secreted IgA class made to protect the gut mucosal surface, because the toxins are internalized
by the gut mucosal cells (see above) and not by the M-cells specialized in furthering an IgA
response. A further problem may be that immunity to the 20 C. difficile S-layer serotypes is
required for prevention of colonization and thus the best protection against infection. For
these reasons, it is likely that the injectable toxin-based vaccines against CDAD currently
under development may turn out to offer poor protection.

As an alternative, we provide a polyvalent live oral vaccine containing (i) the most 
prevalent toxin-producing serotypes (S-layer variants), here attenuated by knock-out of their 
toxin genes, and (ii) carrying a recombinant ORF1-secA cassette encoding relevant parts of
the toxin genes and (iii) an adjuvant peptide ensuring uptake of the immunogenic toxin 
epitopes by e.g. M-cells in order to obtain an IgA anti-toxin response. 
Example 9

Analysis of extracellular and membrane fraction proteins in Clostridium difficile by two-
dimensional PAGE, N-terminal sequencing and data base searches - identification of genes
encoding the S-layer proteins

Identification of extracellular proteins 

Analysis of its extracellular protein pattern by 2-D PAGE showed that two proteins of
50 kDa and 36 kDa were very abundant. Subsequent analysis of membrane preparations from
C. difficile VPI 10463 corroborated the almost exclusive fractionation of the 50 kDa and 36
kDa proteins into the membrane fraction, when compared with the soluble fraction. Also in
strain 630 two proteins with similar molecular weights but with different pI were abundant in
the extracellular as well as in the membrane fraction. Thus, the 50 kDa and 36 kDa proteins
were likely to constitute the C. difficile surface (S)-layer proteins, which are known to be
partially shed from the bacterial surface into the culture supernatant (Luckevich and
Beveridge, 1989; Tsukagoshi et al., 1984). The N-terminal sequence of spot no. 1 from VPI
10463 did not show any homology to other proteins in the C. difficile strain 630 genome
database (Table 3). The N-terminal part of spot no. 2 showed similarity to an open reading
frame encoding a 72 kDa protein in the C. difficile genome database (Table 3; ORF1, see also
Figure 1). However, only nine out of 15 amino acids matched close to the N-terminus of
ORF1. Strikingly, the N-terminal sequences of the corresponding proteins from strain 630
were different from those of VPI 10463 and both matched to ORF1 but at two different
positions (spot No. 10 and 11 in Table 1).

Several proteins were specifically found in PY cultures, i.e. during high toxin
production (Table 3, spot no. 3, 4, 5, and 6). The N-terminal sequence of spot no. 3 matched
with an ORF of 47.5 kDa in the C. difficile genome database. This ORF showed weak
homology to a hypothetical protein in the Plasmodium falciparum genome database. The
N-terminal sequence of spot no. 4 matched with an ORF of 39 kDa that showed homology to a
phage-like element PBSX protein (XkdK) from Bacillus subtilis. The N-terminal sequences of
spot no. 5 and 6 matched with two ORFs of 38 and 22 kDa, respectively, and these ORFs had
the highest similarity to the FixB and FixA proteins from Escherichia coli. Spot no. 7, 8 and 9
were absent in PY but abundant in culture medium from PYG cultures. Spot no. 7 and 8
matched to an ORF located on the same contig as ORF1 (Table 3; ORF3). The N-terminus of
spot no. 9 matched to a central part of ORF1, and is likely to be a proteolytic fragment of a
protein encoded by ORF1.

Analysis of the surface layer genes 

The identification of the S-layer genes revealed a genomic segment including seven
genes (ORF1, 3, 5-7, 9, 11) with significant homology to N-acetyl muramoyl L-alanine
amidase (CwlB/LytC) and modifier protein of major autolysin (LytB) from Bacillus subtilis
(Fig. 1, Table 4). In addition to the LytB/LytC similarity, the N-terminal part of ORF6 showed
similarity to eukaryotic cysteine proteases, and the highly expressed ORF1 (above) showed
weak similarity to S-layer proteins from Lactobacillus and Streptococcus spp. (Fig. 1). The
N-terminus of ORF1 contained a typical signal peptide for export via the Sec-dependent
secretion and the predicted cleavage site was identical to that found in the protein sequence
(not shown). However, no typical protein cleavage site was identified within ORF1 that would
allow processing of the 72 kDa protein further to give the finally sized S-layer proteins found
(50 and 36 kDa). Strikingly, no significant match between the C. difficile S-layer ORFs and
the S-layer homology motif (SLH domain) found in all presently known S-layer proteins was
obtained (not shown). Most of the remaining genes in this genomic segment showed similarity
to genes involved in secretion, polysaccharide and capsule synthesis (Fig. 1; Table 4). At least
2 other genomic sequence segments were found that contained genes similar to CwlB/LytC,
indicating a complex variability (not shown).

Summary of results 

The most dominant surface-exposed protein in many bacterial species is the S-protein. 
This protein crystallizes into a regular monolayer on the outside surface of the bacteria: the S- 
layer. The S-layers satisfy multiple roles for the cell and function as protective coats, as 
structures involved in cell adhesion and surface recognition, as molecular sieves, as molecular
and ion traps, as scaffolding for enzymes and as virulence factors (Sleytr and Beveridge, 
1999; Sara and Sleytr, 2000). Though all S-layers share general features (all are made of 
relatively large proteins, self-assemble and are paracrystalline), comparative studies indicate 
that S-layers are non-conserved structures and are of limited taxonomical value (Kuen and 
Lubitz, 1996; Sleytr et al. 1999). Chemical analysis and genetic studies of a variety of S- 
layers have shown that they are composed of a single, homogenous protein or glycoprotein 
species with molecular weights ranging from 40 to 170 kDa. The S-layers of Clostridium 
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difficile (Takeoka et al., 1991) ami Bacillus anthracis (Mesnage et al. f 1998; Etienne- 
Toumelin et al., 1995) consist of two types of S-layer subunits which together form a defined 
lattice type but do not cross-react with polyclonal antibodies. Typically, S-layer proteins are 
often weakly acidic proteins (pis between 4 to 6), containing 40-60% hydrophobic amino 
5 acids, and possess few or no sulfur-containing amino acids (Messner, 1996). S-protein 

production is directed by single or multiple promoters in front of the S-protein gene, yielding 
stable mRNAs. Most bacteria secrete S-proteins via the general secretory pathway (sec- 
pathway). Silent S-protein genes have been found in Campylobacter fetus and Lactobacillus 
acidophilus. These silent genes are placed in the expression site in a fraction of the bacterial 

1 0 population via inversion of a chromosomal segment (Boot and Pouwels,! 996). 

The S-layer has been detected in some C. difficile strains, and preliminary
characterization has been done for C. difficile C253 (Mauri et al., 1999). Another
independent study purified and identified the S-layer subunits from C. difficile GAI 0714
(Takeoka et al., 1991). In both cases, the S-layer was shown to be composed of two
different protein subunits with apparent molecular weights of 36 kDa and 47 kDa (C. difficile
C253) and 32 kDa and 45 kDa (C. difficile GAI 0714). The S-layer proteins from VPI 10463
and strain 630 were here found to be similar in size but with significant pI differences. The N-
terminal sequences varied significantly, especially for the larger protein. The N-terminal
sequences as determined for these proteins also indicate that they are not identical. Those
from strain 630 appear to be processed products from the same gene (ORF1, Table 3). The N-
terminal sequence of spot 1 from VPI 10463 did not find any homologue in the strain 630
database, corroborating earlier results that the higher molecular weight S-layer protein is
sero-group specific (VPI 10463, group G and strain 630, group X) (Takeoka et al., 1991). Our
current work indicated that this arrangement of two S-layer proteins was also true for all the
different serotypes tested (data not shown). Spot 2 gave a partial match with another
ORF (ORF3) in the same contig as ORF1.

Other ORFs located in the same contig also had similarities with ORF1 and ORF3,
whose C-terminal parts showed similarities to N-acetyl muramoyl L-alanine amidase
(CwlB/LytC) and modifier protein of major autolysin (LytB) from Bacillus subtilis (Lazarevic
et al., 1992), whereas the N-terminal part showed weak similarities to surface-layer proteins
from Lactobacillus helveticus (Callegari et al., 1998) and Streptococcus spp. It is interesting
to note that N-acetyl muramoyl L-alanine peptidoglycan amidase is the major autolysin of B.
subtilis and has high affinity for cell walls, which is enhanced by the modifier protein, but
small amounts of cell free autolysin can be detected in cultures of B. subtilis. Thus, the
amidase-like motif that appears to be typical of C. difficile S-layer proteins probably confers
their anchorage to the cell wall peptidoglycan-teichoic acid.

Considering the highly competitive situation of closely related organisms in their 
natural habitats, it is obvious that the S-layer surface has to contribute to diversification rather 
than to conservation. With respect to this, the importance of S-layer variation leading to the 
expression of alternative S-layer genes under different stress factors such as those imposed by 
the immune system of a host in response to an S-layered pathogen or drastic changes in the 
growth and environmental conditions for nonpathogens is conceivable (Dworkin and Blaser, 
1997; Sara et al., 1996). This could probably explain the variation in S-layer proteins even 
amongst the same species, as in the case of C. difficile.

Identification of spot 4 as having similarity with XkdK, a protein encoded by the phage-
like element PBSX from Bacillus subtilis (Krogh et al., 1996), and being located in a contig
with other ORFs having similarity with other PBSX encoded proteins, is very interesting.
PBSX is a bacteriophage-like bacteriocin, or phibacin, of B. subtilis 168 (Okamoto et al.,
1968). When B. subtilis 168 cells are exposed to treatments that induce the SOS response
(such as UV light, mitomycin C), the cells lyse after incubation for 1 h and release particles of
PBSX (Seaman et al., 1962). Spot 4 is completely absent in PYG supernatants. Taken
together, this could indicate that toxin production (high in PY) and expression of this phage-
like protein in C. difficile are a response to certain stress, environmental or otherwise, that
decides whether it will resort to toxin expression, sporulation or both.

The N-terminal sequences of spots 7 (41 kDa) and 8 (38 kDa) are identical (Table 3)
and correspond to the same ORF (ORF3, Fig. 1), whose N-terminal part is similar to N-acetyl
muramoyl L-alanine amidase (CwlB/LytC) from B. subtilis. However, the size of the proteins
in the gel does not match the size expected from ORF3 (which encodes a 67.5 kDa protein). Both
ORF1 and ORF3 have clear signal sequences at the beginning, which are missing in the protein
spots sequenced, thus indicating that these are indeed secreted and processed following
translation. This could also possibly explain the difference in size and pI for the two different
spots, and the discrepancy between expected and observed molecular weights on SDS-gels.

Spot 9 (24 kDa) has an N-terminal sequence (Table 1) which corresponds to an
internal fragment of ORF1. The expected size of this fragment is around 21 kDa, which
corresponds closely with what is observed experimentally. Clearly there are post-translational
processing events which could be enacted in the cell envelope or in the supernatant. It is
important, however, to note that spots 7-9 are present in PYG supernatants only, when the cells
start sporulating.
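
The comparison of expected and observed sizes made above can be illustrated by a short calculation. The following Python sketch estimates the molecular mass of a polypeptide from its one-letter amino acid sequence using standard average residue masses, optionally after removing a signal peptide of known length. The function names are hypothetical and the calculation is an assumption for illustration only; it is not the procedure used to obtain the sizes quoted in this description. The example input is the signal peptide of SEQ ID NO: 1 transcribed into one-letter code.

```python
# Average residue masses (Da) of the 20 standard amino acids (residue = amino acid minus water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added back for the free N- and C-termini

# SEQ ID NO: 1 (29 residues) in one-letter code, transcribed from the sequence listing.
SEQ_ID_NO_1_SIGNAL_PEPTIDE = "MNKKNIAIAMSGLTVLASAAPVFAATTGT"


def predicted_mass_kda(sequence: str) -> float:
    """Return the average molecular mass (kDa) of an unmodified polypeptide."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence is empty")
    unknown = set(seq) - set(AVERAGE_RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    total_da = sum(AVERAGE_RESIDUE_MASS[res] for res in seq) + WATER_MASS
    return total_da / 1000.0


def mature_mass_kda(precursor: str, signal_length: int) -> float:
    """Mass (kDa) after removing the first `signal_length` residues (hypothetical helper)."""
    if not 0 <= signal_length < len(precursor.strip()):
        raise ValueError("signal_length must be shorter than the precursor")
    return predicted_mass_kda(precursor.strip()[signal_length:])


if __name__ == "__main__":
    mass = predicted_mass_kda(SEQ_ID_NO_1_SIGNAL_PEPTIDE)
    print(f"Signal peptide SEQ ID NO: 1: {mass:.2f} kDa")
```

A calculated mass of this kind ignores post-translational modification and anomalous migration on SDS gels, so it can only indicate whether an observed spot is compatible with a given fragment.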

Spots 9 and 10 are also processed products from ORF1 and are present in both PY
supernatant and membrane fractions. However, these samples are obtained from strain 630.
The results obtained thus far indicate that this operon (contig) (Fig. 1) is present in both VPI
10463 and strain 630, but different ORFs are expressed by the two strains.

Experimental procedures 

Strains and growth medium 

The toxin-producing C. difficile strain VPI 10463 (CCUG 19126, Culture Collection,
University of Göteborg, Sweden) was grown in either PY, PYG (purchased from the
Karolinska Hospital, Stockholm, Sweden) or SDM medium. SDM is identical to MADM
(Karasawa et al., 1995; Yamakawa et al., 1994; Yamakawa et al., 1996), except that the
concentrations of glycine and threonine were 100 mg/L and 200 mg/L, respectively, and that
Ca-D-pantothenate, pyridoxine and biotin were used as the sole vitamin sources. PY(G) was
prepared by adding cysteine (500 mg/L), boiling for 20 min while purging with an anaerobic
gas mixture (10% CO2, 10% H2, 80% N2), sterilised by filtration (Acrodisc,
Gelman Sciences) and aliquoted into tubes with serum vial-style necks (Bellco Glass) while
flushing with anaerobic gas. The tubes were closed with butyl stoppers secured with
aluminium crimp seals. SDM was prepared accordingly.

Growth conditions, sampling and optical density measurements

For each experiment, a tube containing 20 ml SDM was inoculated with 0.2 ml thawed
bacterial suspension (stored at -70°C) using a syringe and a needle that was passed through
the rubber septum of the tube. To avoid entry of O2, the syringe was equilibrated with
anaerobic gas before inoculation. The tube was put horizontally on a rotary shaker (50 rpm,
37°C), and on the next day, the culture was serially diluted into PY or PYG. On day three,
samples were collected from the diluted cultures and OD was measured at 600 nm using a
Hitachi U-1100 spectrophotometer.

Sample preparation and membrane fractionation 

Culture samples were centrifuged at 16000 x g for 3 min, whereafter the supernatants
were removed, filtered, and stored at -20°C for later analysis. The pellet was frozen at -20°C
for 30 min or longer, thawed, dissolved in 1 ml sterile water and sonicated on ice for 3 x 30 s
at 100 W (Labsonic 1510, B. Braun). Larger cell pellets, obtained from >1 ml culture, were
sonicated for longer times. The cell extracts were centrifuged at 5000 x g for 5 min. The pellet
was separated as the low speed pellet (LSP), and the supernatant was further centrifuged at
50000 x g for 20 min. The pellet was separated as the high speed pellet (HSP), and the
supernatant (soluble fraction) was stored at -20°C. The LSP and the HSP were resuspended in
1x PBS (phosphate buffered saline). Protein amount was measured using a kit (Biorad) and a
BSA standard curve according to the manufacturer's instructions.
The culture supernatants were precipitated using trichloroacetic acid (TCA) to a final
concentration of 10%. The pellets were washed with ice-cold acetone, air dried and finally
resuspended in 1x PBS to obtain the extracellular protein fraction. Protein estimation and
analysis were carried out as described earlier.
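
The use of a BSA standard curve for protein estimation can be sketched as follows. This minimal Python example fits a straight line to absorbance readings of BSA standards by ordinary least squares and reads the concentration of an unknown sample off the fitted line. The function names and all numerical values are hypothetical and serve only as an illustration; a linear response over the standard range is assumed, and the kit manufacturer's instructions remain authoritative for the actual assay.

```python
def fit_standard_curve(concentrations, absorbances):
    """Least-squares fit of absorbance = slope * concentration + intercept."""
    n = len(concentrations)
    if n != len(absorbances) or n < 2:
        raise ValueError("need at least two paired standard readings")
    mean_x = sum(concentrations) / n
    mean_y = sum(absorbances) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    if sxx == 0:
        raise ValueError("standard concentrations must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, absorbances))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def protein_concentration(absorbance, slope, intercept, dilution_factor=1.0):
    """Concentration of a sample, corrected for any dilution made before the assay."""
    if slope == 0:
        raise ValueError("standard curve has zero slope")
    return (absorbance - intercept) / slope * dilution_factor


if __name__ == "__main__":
    # Hypothetical BSA standards (mg/ml) and absorbance readings, for illustration only.
    bsa_mg_per_ml = [0.0, 0.25, 0.5, 1.0]
    readings = [0.05, 0.17, 0.30, 0.55]
    slope, intercept = fit_standard_curve(bsa_mg_per_ml, readings)
    sample = protein_concentration(0.40, slope, intercept, dilution_factor=2.0)
    print(f"Estimated protein concentration: {sample:.2f} mg/ml")
```

Samples whose readings fall outside the range of the standards would normally be diluted and measured again rather than extrapolated.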

Immunoprecipitation 

Immunoprecipitation was performed in microtiter wells coated with antibodies against
toxin A (PCG-4, r-Biopharm) or toxin B (ra, r-Biopharm). Ten ng/ml antibody in 0.04 M
Na2CO3, 0.06 M NaHCO3, pH 9.6 was added to microtiter wells, incubated for 1 h at 37°C
and washed four times with PBS containing 0.05% (v/v) Tween-20, pH 7.4. The wells were
loaded with cell extract, culture supernatant medium or PBS (negative control), incubated for 90
min at 25°C, and washed four times with PBS. After addition of 50 µl SDS sample buffer
solution (below) and heating for 5 min at 95°C, the precipitated proteins were analysed by
SDS-PAGE.

SDS-polyacrylamide gel electrophoresis (SDS-PAGE)

SDS-PAGE was performed using pre-cast polyacrylamide gels (ExcelGel 8-18% 
gradient gels, Pharmacia Biotech) and a Multiphor II horizontal slab gel apparatus (Pharmacia 
Biotech) according to the manuals provided by the manufacturer. The samples were mixed 1:1 
with SDS sample buffer solution (0.05 M Tris, 1% (w/v) SDS, 10 mM DTT, 0.01% (w/v) 
bromophenol blue, pH 7), incubated 5 min at 95°C, loaded onto the gels and run at 15°C. 
Chemicals were obtained from Sigma, and molecular weight markers from Pharmacia 
Biotech. The gels were stained with silver (PlusOne, Pharmacia Biotech) using a Hoefer 
automatic gel stainer (Pharmacia Biotech), digitised by scanning (Scanjet 3c/T, Hewlett- 
Packard), and transferred to ClarisDraw (Claris Software) on a Macintosh computer. 
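
Apparent molecular weights such as those quoted in this description are commonly read from a gel by calibrating against the molecular weight markers. The following Python sketch assumes an approximately linear relation between log10 of the marker mass and the migration distance, fits that line by least squares and interpolates the mass of an unknown band. The function names, marker masses and distances are hypothetical illustration values, and the semi-logarithmic relation is only an approximation, particularly on gradient gels; it is not stated in this description how the apparent molecular weights were actually derived.

```python
import math


def fit_semilog(markers):
    """Fit log10(mass_kda) = slope * distance + intercept.

    `markers` is a list of (migration_distance, mass_kda) pairs.
    """
    if len(markers) < 2:
        raise ValueError("need at least two markers")
    xs = [distance for distance, _ in markers]
    ys = [math.log10(mass) for _, mass in markers]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("marker distances must not all be equal")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_mass_kda(distance, slope, intercept):
    """Apparent mass (kDa) of a band at the given migration distance."""
    return 10 ** (slope * distance + intercept)


if __name__ == "__main__":
    # Hypothetical markers: (migration distance in mm, mass in kDa).
    markers = [(12.0, 94.0), (20.0, 67.0), (33.0, 43.0),
               (47.0, 30.0), (62.0, 20.1), (75.0, 14.4)]
    slope, intercept = fit_semilog(markers)
    print(f"Band at 40 mm: about {estimate_mass_kda(40.0, slope, intercept):.0f} kDa")
```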
Immunoblotting

Proteins were separated by SDS-PAGE and transferred to polyvinylidene fluoride
membranes (Immobilon P, Millipore) using the Pharmacia Novablot transfer equipment
and a continuous buffer system (39 mM glycine, 48 mM Tris, 0.0375% (w/v) SDS, 20% (v/v)
methanol) according to the Multiphor II manual. The membranes were dried at 25°C for 1.5 h,
blocked with 0.5% Tween-20 for 20 min, and then incubated with toxin A or toxin B
antibodies (r-Biopharm, 0.2 µg/ml in TST buffer containing 0.05 M Tris, 0.5 M NaCl, 0.1%
Tween-20, pH 9) for 1 h. After three washes in TST, the membranes were incubated with
horse-radish peroxidase conjugated anti-mouse antibodies (DAKOPATTS, diluted 1:10000 in
TST) for 1 h and washed three times in TST. A chemiluminescent signal (ECL Plus,
Amersham) was used to detect the bands. The relative amount of toxin B was measured on
scanned x-ray films using the Molecular Analyst software (Biorad).

Two-dimensional polyacrylamide gel electrophoresis (2-D PAGE)

Protein samples were obtained as described under sample preparation. For 2-D PAGE,
40 ml aliquots of each sample were mixed with 160 ml of buffer HI [9.9 M urea, 4% (v/v)
Igepal CA630, 2.2% (v/v) ampholytes 3-10, 100 mM DTT, 2% (w/v) CHAPS]. Protein
mixtures were focused at 20°C on 180 mm IPG Drystrip pH 4-7 (Pharmacia Biotech) using the
Multiphor II 2-D gel Kit according to the manufacturer's instructions. The second dimension
was run on 12% (linear) SDS acrylamide gels. The gels were stained with silver (PlusOne,
Pharmacia Biotech) using a Hoefer automatic gel stainer (Pharmacia Biotech). The chemicals
were obtained from Sigma except for Pharmalytes (Pharmacia Biotech).

N-terminal sequencing 

For N-terminal sequence determination, gels were transferred to Immobilon-P
polyvinylidene difluoride membrane (Millipore) as described under immunoblotting. The
membrane was stained with Coomassie blue; protein spots were excised for sequence
determination. The protein spots cut from the transfer membrane were washed four times in
10% methanol and then dried and frozen. N-terminal sequence analysis was performed at the
Protein Analysis Center, Karolinska Institute. Peptide sequences were matched against the
C. difficile genome database (Sanger Center, UK).
CLAIMS

1. A gene expression cassette comprising a secretory leader sequence encoding a
signal peptide from Clostridium difficile having an amino acid sequence selected from the 
group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID 
NO: 5, SEQ ID NO: 6, SEQ ID NO: 7 and signal peptides of analogous exported clostridial 
N-acetylmuramoyl-L-alanine amidase-like proteins, linked to a DNA sequence encoding a 
heterologous polypeptide. 

2. A gene expression cassette according to claim 1, wherein the signal peptides of the 
analogous clostridial N-acetylmuramoyl-L-alanine amidase-like proteins are selected from 
Clostridium difficile signal peptides having an amino acid sequence of any one of SEQ ID 
NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, and SEQ ID NO: 12.

3. A gene expression cassette according to claim 1 or 2, which further includes a 
promoter of prokaryotic origin. 

4. A gene expression cassette according to claim 3, wherein the promoter is selected 
from clostridial promoters comprising a nucleotide sequence selected from the group 
consisting of SEQ ID NO: 13 - 21.

5. A gene expression cassette according to any one of claims 1-4 which further includes
a DNA sequence encoding at least a cell wall binding portion of a protein of prokaryotic 
origin functioning in Clostridia such that a fusion polypeptide may be presented on the outer 
surface of a host cell harbouring the cassette. 

6. A gene expression cassette according to any one of claims 1-4, which further
includes a DNA sequence encoding at least a functional cell wall binding portion of an S-layer
protein of C. difficile selected from any one of the polypeptides having an amino acid
sequence selected from SEQ ID NO: 22 - 33 such that a fusion polypeptide may be presented
on the outer surface of a host cell harbouring the cassette.

7. A gene expression cassette according to claim 5 or 6, wherein DNA encoding the
cell wall binding portions of SEQ ID NO: 22-33 has been omitted such that the fusion
peptide is secreted into the surrounding milieu by the host cell harbouring the cassette.

8. A gene expression cassette according to claim 6 or 7, wherein the DNA sequence 
encoding the heterologous peptide is inserted at a point downstream the first (signal) 
proteolytic cleavage site in the gene encoding a polypeptide having an amino acid sequence 
selected from SEQ ID NO: 22-33, optionally including or excluding its second cleavage site. 

9. A gene expression cassette according to any one of claims 1-8, which further 
comprises at least a functional part of a secretory (secA) gene recognizing the signal peptide, 
to allow translocation of a heterologous polypeptide and/or fusion polypeptide across the 
cytoplasmic membrane of a host cell harbouring the expression cassette. 

10. A gene expression cassette according to claim 9, wherein the secretory gene is from 
C. difficile and encodes a polypeptide having the amino acid sequence SEQ ID NO: 34.

11. A gene expression cassette as shown in Figure 3. 

12. A vector comprising a gene expression cassette as claimed in any one of claims 1-11. 

13. A vector according to claim 12, wherein the vector is a plasmid. 

14. A host organism transformed with a vector according to claim 12 or 13 for 
expression of the heterologous polypeptide and/or fusion polypeptide. 

15. A Clostridium host organism transformed with a vector according to claim 12 or 13 
for expression of the heterologous polypeptide and/or fusion polypeptide. 

16. A host organism as claimed in claim 15 which is C. difficile.

17. A host organism as claimed in claim 15 which is C. perfringens.

18. A pharmaceutical or veterinary composition or formulation which comprises
Clostridial cells transformed with a vector according to claim 12 or 13, with the ability to
present on the cell surface and/or to secrete an expressed heterologous polypeptide or fusion
polypeptide, together with a pharmaceutically or veterinary acceptable carrier or diluent.

19. A composition or formulation according to claim 18, which is suitable for oral or
intranasal administration.

20. A composition or formulation according to claim 18 or 19, which further comprises,
as an adjuvant, non-toxic motifs of the C. difficile toxins A and/or B that enable the expressed
heterologous polypeptide and/or fusion polypeptide to enter the colonic mucosal cells of a
mammal by receptor-mediated endocytosis, and/or a portion of toxin B responsible for its
intracellular and intercellular spread.

21. A composition or formulation according to claim 18 or 19, which further comprises
a further fused or separate carrier peptide or adjuvant, in addition to the expressed
heterologous polypeptide and/or fusion polypeptide, to elicit a stronger or differently directed
immune response than that against the expressed heterologous polypeptide acting alone.

22. A vaccine which comprises a Clostridial bacterium transformed with a vector
according to claim 12 or 13 and capable of presenting on the surface of the bacterium and/or
secreting an antigen in a human or animal body, and optionally also an adjuvant described in
claim 20 or 21.

23. A medicament which comprises a Clostridial bacterium transformed with a vector
according to claim 12 or 13 and capable of presenting on the surface of the bacterium and/or
secreting a heterologous polypeptide and/or fusion polypeptide in a human or animal body,
and optionally also an adjuvant described in claim 20 or 21.

24. A vaccine according to claim 22, which comprises a mixture of at least two
differently engineered Clostridia strains, each capable of presenting on the surface of the
bacterium and/or secreting a different heterologous polypeptide and/or fusion polypeptide.

25. A medicament according to claim 23, which comprises a mixture of at least two
differently engineered Clostridia strains, each capable of presenting on the surface of the
bacterium and/or secreting a different heterologous polypeptide and/or fusion polypeptide.

26. A vaccine which comprises spores of Clostridia cells or bacteria transformed with a 
vector according to claim 12 or 13, and capable of germinating into cells which are able to 
grow, express, and present or secrete a heterologous polypeptide and/or fusion polypeptide, 
and optionally an adjuvant described in claim 20 or 21, in a mammalian body. 

27. A medicament which comprises spores of Clostridia cells or bacteria transformed 
with a vector according to claim 12 or 13, and capable of germinating into cells which are able 
to grow, express, and present or secrete a heterologous polypeptide and/or fusion polypeptide 
and optionally an adjuvant described in claim 20 or 21, in a mammalian body.

28. A vaccine according to claim 26, which comprises a mixture of spores from at least 
two differently engineered Clostridia strains. 

29. A medicament according to claim 27, which comprises a mixture of spores from at 
least two differently engineered Clostridia strains. 

30. A method for vaccination of a mammal, which comprises administering a 
therapeutically or prophylactically effective dose of a vaccine according to any one of claims 
22, 24, 26 and 28 to the mammal. 

31. A method for prophylactic or therapeutic treatment of a mammal, which comprises 
administering a therapeutically or prophylactically effective dose of a medicament according 
to any one of claims 23, 25, 27 and 29 to the mammal. 

32. A vaccine according to claim 26 or 28, a medicament according to claim 27 or 29, 
or a method according to claim 30 or 31, wherein the spores are from C. difficile or C.
perfringens. 
33. A C. difficile-associated diarrhea (CDAD) vaccine comprising spores according to
claim 26 or 32 and capable of expressing, after germination,

(i) relevant parts of the C. difficile toxins, alone or together with a

(ii) suitable adjuvant to provide primarily an IgA response to the toxin antigenic epitopes
and/or

(iii) S-layer protein antigenic variants (serotype antigens) or fimbrial antigens from C.
difficile to obtain, after administration to a mammal, a polyvalent anti-S-layer (or anti-
fimbrial) IgA response preventing C. difficile colonization of the mammal.
SEQUENCE LISTING

<110> Smittskyddsinstitutet

<120> Gene expression cassette and its use

<130> 110060300

<140>
<141>

<150> SE 0002139-4
<151> 2000-06-07

<150> 0101479-4
<151> 2001-04-26

<160> 35

<170> PatentIn Ver. 2.1

<210> 1 
<211> 29 
<212> PRT 

<213> Clostridium difficile 
<400> 1 

Met Asn Lys Lys Asn Ile Ala Ile Ala Met Ser Gly Leu Thr Val Leu
1 5 10 15

Ala Ser Ala Ala Pro Val Phe Ala Ala Thr Thr Gly Thr
20 25



<210> 2 
<211> 29 
<212> PRT 

<213> Clostridium difficile 
<400> 2 

Met Asn Lys Lys Asn Leu Ser Val Ile Met Ala Ala Ala Met Ile Ser
1 5 10 15

Thr Ser Val Ala Pro Val Phe Ala Ala Glu Thr Thr Gln
20 25



<210> 3 
<211> 32 
<212> PRT 

<213> Clostridium difficile 
<400> 3 

Met Lys Ile Ser Lys Lys Ile Val Ser Leu Leu Thr Met Thr Phe Leu
1 5 10 15

Thr Val Thr Leu Tyr Gly Asn Thr Ser Asn Ala Ser Thr Lys Asp Thr
20 25 30



<210> 4
<211> 34 
<212> PRT 

<213> Clostridium difficile 
<400> 4 

Met Arg Lys Tyr Lys Ser Lys Lys Leu Ser Lys Leu Leu Ala Leu Ser
1 5 10 15

Thr Val Cys Phe Leu Ile Val Ser Thr Ile Pro Val Ser Ala Glu Asn
20 25 30

His Lys 



<210> 5 
<211> 32 
<212> PRT 

<213> Clostridium difficile 
<400> 5 

Met Lys Ala Pro Lys Thr Ile Leu Thr Ile Leu Thr Ile Ala Leu Thr
1 5 10 15

Leu Ser Ser Ile Ser Ile Ile Pro Ser Tyr Ala Leu Thr Glu Glu Lys
20 25 30



<210> 6 
<211> 35 
<212> PRT 

<213> Clostridium difficile 
<400> 6 

Met Arg Gly Asp Met Met Lys Lys Thr Thr Lys Leu Leu Ala Thr Gly 
1 5 10 15 

Met Leu Ser Val Ala Met Val Ala Pro Asn Val Ala Leu Ala Ala Glu 
20 25 30 

Asn Thr Thr 
35 



<210> 7 
<211> 31 
<212> PRT 

<213> Clostridium difficile 
<400> 7 

Met Ile Lys Lys Ile Ser Thr Ile Leu Ser Leu Val Leu Leu Ile Ser
1 5 10 15

Ile Ser Ser Thr Ile Gly Val Phe Ala Asp Ala Asn Pro Lys Arg
20 25 30



<210> 8
<211> 34 
<212> PRT 

<213> Clostridium difficile 
<400> 8 

Met Leu Ser Asn Lys Lys Arg Ser Met Ala Ile Val Met Ala Gly Ala
1 5 10 15

Thr Val Met Ser Ala Ala Ala Pro Ile Phe Ala Asp Asn Thr Val Thr
20 25 30

Glu Asn 



<210> 9 
<211> 41 
<212> PRT 

<213> Clostridium difficile 
<400> 9

Met Lys Ser Thr Leu Gly Val Glu Asn Asn Met Lys Asn Ser Lys Lys
1 5 10 15

Ile Leu Ala Ile Gly Leu Thr Leu Phe Leu Val Met Val Asn Thr Pro
20 25 30 

Met Val Ser Ala Leu Thr Ser Val Glu 
35 40 



<210> 10 
<211> 34 
<212> PRT 

<213> Clostridium difficile 
<400> 10 

Met Asn Lys Arg Lys Ser Phe Ile Arg Thr Ile Ala Val Ser Thr Met
1 5 10 15

Ala Val Ala Val Thr Gly Ser Ala Thr Cys Ala Tyr Ala Ala Pro Val
20 25 30

Leu Gln



<210> 11 
<211> 47 
<212> PRT 

<213> Clostridium difficile 
<400> 11 

Met Glu Asn Asn His Asn Ile Asn Ile Lys Tyr Lys Asn His Gln Gly
1 5 10 15

Asp Met Lys Met Asn Lys Lys Ile Leu Ser Leu Gly Leu Ala Val Ser
20 25 30

Leu Ile Leu Val Asn Phe Lys Ser Val Asn Ala Ser Ser Val Val
35 40 45



<210> 12 
<211> 32 
<212> PRT 

<213> Clostridium difficile 
<400> 12 

Met Lys Val Asn Lys Arg Val Leu Ser Ile Gly Leu Ala Ile Ser Leu
1 5 10 15

Ile Met Ala Gly Ala Pro Asn Ile Asn Ala Leu Ser Ser Ile Glu Lys
20 25 30



<210> 13 
<211> 50 
<212> DNA 

<213> Clostridium difficile 
<400> 13 

tagtttatta cattttaaaa tttagggtat aaaaacttgt aaacttggag 50 



<210> 14 
<211> 50 
<212> DNA 

<213> Clostridium difficile 
<400> 14 

cattttaaaa tttagggtat aaaaacttgt aaacttggag aaaataataa 50



<210> 15 
<211> 50 
<212> DNA 

<213> Clostridium difficile 
<400> 15 

aacttgtaaa cttggagaaa ataataattt aaaaaaatag cttgcaaaaa 50 



<210> 16 
<211> 50 
<212> DNA 

<213> Clostridium difficile 
<400> 16 

aacttggaga aaataataat ttaaaaaaat agcttgcaaa aagaataaaa 50 



<210> 17
<211> 50
<212> DNA

<213> Clostridium difficile
<400> 17

taatttaaaa aaatagcttg caaaaagaat aaaaatggat tattatagag 50



<210> 18
<211> 50 
<212> DNA 

<213> Clostridium difficile 
<400> 18 

aatagcttgc aaaaagaata aaaatggatt attatagaga tgtgagaaat 50 



<210> 19 
<211> 50 
<212> DNA 

<213> Clostridium difficile 
<400> 19 

tattatagag atgtgagaaa tattaggaat atatggatga ttattctatg 50 



<210> 20 
<211> 50 
<212> DNA 

<213> Clostridium difficile 
<400> 20 

gatgtgagaa atattaggaa tatatggatg attattctat gtacataata 50



<210> 21 
<211> 50 
<212> DNA 

<213> Clostridium difficile 
<400> 21 

gattattcta tgtacataat aaagagatgt aattttaata taatgttggg 50 



<210> 22 
<211> 719 
<212> PRT 

<213> Clostridium difficile 
<400> 22 

Met Asn Lys Lys Asn Ile Ala Ile Ala Met Ser Gly Leu Thr Val Leu
1 5 10 15

Ala Ser Ala Ala Pro Val Phe Ala Ala Thr Thr Gly Thr Gln Gly Tyr
20 25 30

Thr Val Val Lys Asn Asp Trp Lys Lys Ala Val Lys Gln Leu Gln Asp
35 40 45

Gly Leu Lys Asp Asn Ser Ile Gly Lys Ile Thr Val Ser Phe Asn Asp
50 55 60

Gly Val Val Gly Glu Val Ala Pro Lys Ser Ala Asn Lys Lys Ala Asp
65 70 75 80

Arg Asp Ala Ala Ala Glu Lys Leu Tyr Asn Leu Val Asn Thr Gln Leu
85 90 95

Asp Lys Leu Gly Asp Gly Asp Tyr Val Asp Phe Ser Val Asp Tyr Asn
100 105 110

Leu Glu Asn Lys Ile Ile Thr Asn Gln Ala Asp Ala Glu Ala Ile Val
115 120 125

Thr Lys Leu Asn Ser Leu Asn Glu Lys Thr Leu Ile Asp Ile Ala Thr
130 135 140

Lys Asp Thr Phe Gly Met Val Ser Lys Thr Gln Asp Ser Glu Gly Lys
145 150 155 160

Asn Val Ala Ala Thr Lys Ala Leu Lys Val Lys Asp Val Ala Thr Phe
165 170 175

Gly Leu Lys Ser Gly Gly Ser Glu Asp Thr Gly Tyr Val Val Glu Met
180 185 190

Lys Ala Gly Ala Val Glu Asp Lys Tyr Gly Lys Val Gly Asp Ser Thr
195 200 205

Ala Gly Ile Ala Ile Asn Leu Pro Ser Thr Gly Leu Glu Tyr Ala Gly
210 215 220

Lys Gly Thr Thr Ile Asp Phe Asn Lys Thr Leu Lys Val Asp Val Thr
225 230 235 240

Gly Gly Ser Thr Pro Ser Ala Val Ala Val Ser Gly Phe Val Thr Lys
245 250 255

Asp Asp Thr Asp Leu Ala Lys Ser Gly Thr Ile Asn Val Arg Val Ile
260 265 270

Asn Ala Lys Glu Glu Ser Ile Asp Ile Asp Ala Ser Ser Tyr Thr Ser
275 280 285

Ala Glu Asn Leu Ala Lys Arg Tyr Val Phe Asp Pro Asp Glu Ile Ser
290 295 300

Glu Ala Tyr Lys Ala Ile Val Ala Leu Gln Asn Asp Gly Ile Glu Ser
305 310 315 320

Asn Leu Val Gln Leu Val Asn Gly Lys Tyr Gln Val Ile Phe Tyr Pro
325 330 335

Glu Gly Lys Arg Leu Glu Thr Lys Ser Ala Asn Asp Thr Ile Ala Ser
340 345 350

Gln Asp Thr Pro Ala Lys Val Val Ile Lys Ala Asn Lys Leu Lys Asp
355 360 365

Leu Lys Asp Tyr Val Asp Asp Leu Lys Thr Tyr Asn Asn Thr Tyr Ser
370 375 380

Asn Val Val Thr Val Ala Gly Glu Asp Arg Ile Glu Thr Ala Ile Glu
385 390 395 400

Leu Ser Ser Lys Tyr Tyr Asn Ser Asp Asp Lys Asn Ala Ile Thr Asp
405 410 415

Lys Ala Val Asn Asp Ile Val Leu Val Gly Ser Thr Ser Ile Val Asp
420 425 430

Gly Leu Val Ala Ser Pro Leu Ala Ser Glu Lys Thr Ala Pro Leu Leu
435 440 445

Leu Thr Ser Lys Asp Lys Leu Asp Ser Ser Val Lys Ser Glu Ile Lys
450 455 460

Arg Val Met Asn Leu Lys Ser Asp Thr Gly Ile Asn Thr Ser Lys Lys
465 470 475 480

Val Tyr Leu Ala Gly Gly Val Asn Ser Ile Ser Lys Asp Val Glu Asn
485 490 495

Glu Leu Lys Asn Met Gly Leu Lys Val Thr Arg Leu Ser Gly Glu Asp
500 505 510

Arg Tyr Glu Thr Ser Leu Ala Ile Ala Asp Glu Ile Gly Leu Asp Asn
515 520 525

Asp Lys Ala Phe Val Val Gly Gly Thr Gly Leu Ala Asp Ala Met Ser
530 535 540

Ile Ala Pro Val Ala Ser Gln Leu Lys Asp Gly Asp Ala Thr Pro Ile
545 550 555 560

Val Val Val Asp Gly Lys Ala Lys Glu Ile Ser Asp Asp Ala Lys Ser
565 570 575

Phe Leu Gly Thr Ser Asp Val Asp Ile Ile Gly Gly Lys Asn Ser Val
580 585 590

Ser Lys Glu Ile Glu Glu Ser Ile Asp Ser Ala Thr Gly Lys Thr Pro
595 600 605

Asp Arg Ile Ser Gly Asp Asp Arg Gln Ala Thr Asn Ala Glu Val Leu
610 615 620

Lys Glu Asp Asp Tyr Phe Thr Asp Gly Glu Val Val Asn Tyr Phe Val
625 630 635 640

Ala Lys Asp Gly Ser Thr Lys Glu Asp Gln Leu Val Asp Ala Leu Ala
645 650 655

Ala Ala Pro Ile Ala Gly Arg Phe Lys Glu Ser Pro Ala Pro Ile Ile
660 665 670

Leu Ala Thr Asp Thr Leu Ser Ser Asp Gln Asn Val Ala Val Ser Lys
675 680 685

Ala Val Pro Lys Asp Gly Gly Thr Asn Leu Val Gln Val Gly Lys Gly
690 695 700

Ile Ala Ser Ser Val Ile Asn Lys Met Lys Asp Leu Leu Asp Met
705 710 715



<210> 23 
<211> 623 
<212> PRT 

<213> Clostridium difficile
<400> 23

Met Asn Lys Lys Asn Leu Ser Val Ile Met Ala Ala Ala Met Ile Ser
1 5 10 15

Thr Ser Val Ala Pro Val Phe Ala Ala Glu Thr Thr Gln Val Lys Lys
20 25 30

Glu Thr Ile Thr Lys Lys Glu Ala Thr Glu Leu Val Ser Lys Val Arg
35 40 45

Asp Leu Met Ser Gln Lys Tyr Thr Gly Gly Ser Gln Val Gly Gln Pro
50 55 60

Ile Tyr Glu Ile Lys Val Gly Glu Thr Leu Ser Lys Leu Lys Ile Ile
65 70 75 80

Thr Asn Ile Asp Glu Leu Glu Lys Leu Val Asn Ala Leu Gly Glu Asn
85 90 95

Lys Glu Leu Ile Val Thr Ile Thr Asp Lys Gly His Ile Thr Asn Ser
100 105 110

Ala Asn Glu Val Val Ala Glu Ala Thr Glu Lys Tyr Glu Asn Ser Ala
115 120 125

Asp Leu Ser Ala Glu Ala Asn Ser Ile Thr Glu Lys Ala Lys Thr Glu
130 135 140

Thr Asn Gly Ile Tyr Lys Val Ala Asp Val Lys Ala Ser Tyr Asp Ser
145 150 155 160

Ala Lys Asp Lys Leu Val Ile Thr Leu Arg Asp Lys Thr Glu Thr Val
165 170 175

Thr Ser Lys Thr Ile Tyr Val Gly Ile Gly Asp Glu Lys Val Asp Leu
180 185 190

Thr Ala Asn Pro Val Asp Ser Thr Gly Thr Asn Leu Asp Pro Ser Ala
195 200 205

Glu Gly Phe Arg Val Asn Lys Ile Asp Lys Leu Gly Val Ala Gly Ala
210 215 220

Lys Asn Ile Asp Asp Val Gln Leu Ala Glu Ile Thr Ile Lys Asn Ser
225 230 235 240

Asp Leu Asn Thr Val Ser Pro Gln Asp Leu Tyr Asp Gly Tyr Arg Leu
245 250 255

Thr Val Lys Gly Asn Met Val Ala Asn Gly Thr Ser Lys Ser Ile Ser
260 265 270

Asp Ile Ser Ala Lys Asp Ser Glu Thr Gly Lys Tyr Lys Phe Thr Ile
275 280 285

Lys Tyr Thr Asp Ala Ser Gly Lys Ala Thr Glu Leu Thr Val Glu Ser
290 295 300
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Thr Asn Glu Lys Asp Leu Lys Asp Ala Lys Ala Ala Leu Glu Gly Asn 
305 310 315 320 

Ser Lys Val Lys Leu lie Ala Gly Asp Asp Arg Tyr Ala Thr Ala Val 
325 330 335 

Ala He Ala Lys Gin Thr Lys Tyr Thr Asp Asn He Val He Val Asn 
340 345 350 

Ser Asn Lys Leu Val Asp Gly Leu Ala Ala Thr Pro Leu Ala Gin Ser 
355 360 365 

Lys Lys Ala Pro He Leu Leu Ala Ser Asp Asn Glu He Pro Lys Val 
370 375 380 

Thr Leu Asp Tyr He Lys Asp He He Lys Lys Ser Pro Ser Ala Lys 
385 390 395 400 

He Tyr He Val Gly Gly Glu Ser Ala Val Ser Asn Thr Ala Lys Lys 
405 410 415 

Gin Leu Glu Ser Val Thr Lys Asn He Glu Arg Leu Ala Gly Asp Asp 
420 425 430 

Arg His Thr Thr Ser Val Ala Val Ala Lys Ala Met Gly Ser Phe Lys 
435 440 445 

Asp Ala Phe Val Val Gly Ala Lys Gly Glu Ala Asp Ala Met Ser Ile
450 455 460

Ala Ala Lys Ala Ala Glu Leu Lys Ala Pro Ile Ile Val Asn Gly Trp
465 470 475 480

Asn Asp Leu Ser Ala Asp Ala Ile Lys Leu Met Asp Gly Lys Glu Ile
485 490 495

Gly Ile Val Gly Gly Ser Asn Asn Val Ser Ser Gln Ile Glu Asn Gln
500 505 510

Leu Ala Asp Ile Asp Lys Asp Arg Lys Val Gln Arg Val Glu Gly Glu
515 520 525

Thr Arg His Asp Thr Asn Ala Lys Val Ile Glu Thr Tyr Tyr Gly Lys
530 535 540

Leu Asp Lys Leu Tyr Ile Ala Lys Asp Gly Tyr Gly Asn Asn Gly Met
545 550 555 560

Leu Val Asp Ala Leu Ala Ala Gly Pro Leu Ala Ala Gly Lys Gly Pro
565 570 575

Ile Leu Leu Ala Lys Thr Asp Ile Thr Asp Ser Gln Lys Asn Ala Leu
580 585 590

Ser Lys Lys Leu Asn Leu Gly Ala Glu Val Thr Gln Ile Gly Asn Gly
595 600 605

Val Glu Leu Thr Val Ile Gln Lys Ile Ala Lys Ile Leu Gly Trp
610 615 620



<210> 24 
<211> 610
<212> PRT 

<213> Clostridium difficile 
<400> 24 

Met Lys Ile Ser Lys Lys Ile Val Ser Leu Leu Thr Met Thr Phe Leu
1 5 10 15

Thr Val Thr Leu Tyr Gly Asn Thr Ser Asn Ala Ser Thr Lys Asp Thr
20 25 30

Leu Thr Gly Ser Gly Arg Trp Glu Thr Ala Ile Lys Ile Ser Gln Ala
35 40 45

Gly Trp Thr Lys Ser Glu Ser Ala Val Leu Val Asn Asp Asn Ser Ile
50 55 60

Ala Asp Ala Leu Ser Ala Thr Pro Phe Ala Lys Ala Lys Asp Ala Pro
65 70 75 80

Ile Leu Leu Thr Gln Ser Asn Lys Leu Asp Ser Arg Thr Lys Ala Glu
85 90 95

Leu Lys Arg Leu Gly Val Lys Asn Val Tyr Leu Ile Gly Gly Ser Ile
100 105 110

Ala Leu Ser Ser Glu Ile Glu Lys Gln Leu Asn Ala Glu Asn Ile Asn
115 120 125

Phe Glu Arg Ile Ser Gly Asn Ser Arg Tyr Asp Thr Ser Leu Lys Leu
130 135 140

Ala Glu Lys Leu Asp Arg Glu Lys Ser Ile Ser Lys Ile Val Val Val
145 150 155 160

Asn Gly Glu Lys Gly Leu Ala Asp Ala Val Ser Val Gly Ala Ile Ala
165 170 175

Ala Gln Glu Asn Met Pro Ile Ile Leu Ser Asp Ser Glu Asn Gly Thr
180 185 190

Glu Val Ala Asp Asn Phe Ile Asp Ser Lys Asp Ile Ala Lys Ser Tyr
195 200 205

Val Ile Gly Gly Thr Tyr Ser Ile Ser Asn Ser Val Glu Arg Ser Leu
210 215 220

Pro Asn Ala Thr Arg Ile Ala Gly Ser Ser Arg Ser Glu Thr Asn Ala
225 230 235 240

Lys Ile Ile Glu Glu Phe Tyr Lys Asp Thr Asp Ile Lys Asn Ile Tyr
245 250 255

Val Thr Lys Asp Gly Thr Lys Asn Lys Asn Asp Leu Ile Asp Ser Leu
260 265 270

Ala Val Gly Val Leu Ala Ala Lys Asn Ser Ser Pro Ile Val Leu Ala
275 280 285

Gly Asn Lys Leu Asp Thr Thr Gln Lys Asp Val Leu Asn Thr Lys Ile
290 295 300

Ile Asp Lys Val Thr Gln Ile Gly Gly Leu Gly Asn Glu Asn Val Val
305 310 315 320

Glu Asp Ile Leu Asp Ile Gln Glu Glu Thr Lys Tyr Thr Val Glu Thr
325 330 335

Ile Asp Glu Leu Asn Ala Ala Ile Lys Arg Ala Asp Ala Asn Asp Ile
340 345 350

Ile Lys Phe Lys Pro Glu Lys Glu Lys Thr Ile Asn Asn Ser Phe Ser
355 360 365

Ile Glu Thr Lys Lys Thr Val Thr Ile Glu Leu Asp Gly Arg Tyr Arg
370 375 380

Gln Thr Ile Thr Leu Asp Ile Pro Asn Gly Lys Phe Asn Asn Tyr Ala
385 390 395 400

Glu Ile Glu Gly Gly Val Lys Leu Lys Asn Ile Lys Asn Glu Ser Leu
405 410 415

Val Asn Lys Gly Ser Ile Gln Asp Leu Asp Ile Tyr Asp Glu Asn Gly
420 425 430

Cys Lys Ile Glu Asn Glu Ser Ser Gly Glu Ile Trp Phe Val Thr Ile
435 440 445

Val Glu Glu Ala Asn Asp Val Tyr Ile Val Asn Ser Gly Asp Ile Thr
450 455 460

Lys Ile Ser Asn Asn Ser Ser Ser Thr Ile Ile Arg Asn Ser Gly Asn
465 470 475 480

Ile Asp Thr Val Thr Gly Lys Lys Glu Pro Ala Ile Ser Gly Asn Lys
485 490 495

Pro Lys Val Asn Asp Thr Glu Lys Glu Thr Lys Ala Ala Arg Gly Leu 
500 505 510 

Asn Pro Arg Val Glu Ala Cys Ser Val Pro Lys Lys Asp Tyr Val Met 
515 520 525 

Ile Thr Ile Pro Asn Ser Pro Lys Asp Ser Arg Tyr Lys Ile Tyr Tyr
530 535 540

Arg Val Val Tyr Asn Lys Pro Tyr Ala Met Asp Val Gly Asp Lys Ile
545 550 555 560

Asn Ile Gly Glu Trp Thr Val Ala Pro Thr Asp Glu Glu Pro Phe Leu
565 570 575 

Glu Lys Ala Lys Asn Gly Cys Tyr Val Glu Ala Val Glu Val Asn Thr 
580 585 590 

Ser Thr Lys Glu Val Ser Arg Trp Gly Arg Thr Asn Ala Thr Asp Asp 
595 600 605 

Gly Phe 
610



<210> 25
<211> 803 
<212> PRT 

<213> Clostridium difficile 
<400> 25 

Met Arg Lys Tyr Lys Ser Lys Lys Leu Ser Lys Leu Leu Ala Leu Ser 
1 5 10 15

Thr Val Cys Phe Leu Ile Val Ser Thr Ile Pro Val Ser Ala Glu Asn
20 25 30

His Lys Thr Leu Asp Gly Val Glu Thr Ala Glu Tyr Ser Glu Ser Tyr
35 40 45

Leu Gln Tyr Leu Glu Asp Val Lys Asn Gly Asp Thr Ala Lys Tyr Asn
50 55 60

Gly Val Ile Pro Phe Pro His Glu Met Glu Gly Thr Thr Leu Arg Asn
65 70 75 80 

Lys Gly Arg Ser Ser Leu Pro Ser Ala Tyr Lys Ser Ser Val Ala Tyr 
85 90 95 

Asn Pro Met Asp Leu Gly Leu Thr Thr Pro Ala Lys Asn Gln Gly Ser
100 105 110

Leu Asn Thr Cys Trp Ser Phe Ser Gly Met Ser Thr Leu Glu Ala Tyr
115 120 125

Leu Lys Leu Lys Gly Tyr Gly Thr Tyr Asp Leu Ser Glu Glu His Leu
130 135 140

Arg Trp Trp Ala Thr Gly Gly Lys Tyr Gly Trp Asn Leu Asp Asp Met
145 150 155 160

Ser Gly Ser Ser Asn Val Thr Ala Ile Gly Tyr Leu Thr Ala Trp Ala
165 170 175

Gly Pro Lys Leu Glu Lys Asp Ile Pro Tyr Asn Leu Lys Ser Glu Ala
180 185 190

Gln Gly Ala Thr Lys Pro Ser Asn Met Asp Thr Ala Pro Thr Gln Phe
195 200 205

Asn Val Thr Asp Val Val Arg Leu Asn Lys Asp Lys Glu Thr Val Lys
210 215 220

Asn Ala Ile Met Gln Tyr Gly Ser Val Thr Ser Gly Tyr Ala His Tyr
225 230 235 240

Ser Thr Tyr Phe Asn Lys Asp Glu Thr Ala Tyr Asn Cys Thr Asn Lys
245 250 255

Arg Ala Pro Leu Asn His Ala Val Ala Ile Val Gly Trp Asp Asp Asn
260 265 270 

Tyr Ser Lys Asp Asn Phe Ala Ser Asp Val Lys Pro Glu Ser Asn Gly 
275 280 285

Ala Trp Leu Val Lys Ser Ser Trp Gly Glu Phe Asn Ser Met Lys Gly
290 295 300

Phe Phe Trp Ile Ser Tyr Glu Asp Lys Thr Leu Leu Thr Asp Thr Asp
305 310 315 320

Asn Tyr Ala Met Lys Ser Val Ser Lys Pro Asp Ser Asp Lys Lys Met
325 330 335

Tyr Gln Leu Glu Tyr Ala Gly Leu Ser Lys Ile Met Ser Asn Lys Val
340 345 350 

Thr Ala Ala Asn Val Phe Asp Phe Ser Arg Asp Ser Glu Lys Leu Asp 
355 360 365 

Ser Val Met Phe Glu Thr Asp Ser Val Gly Ala Lys Tyr Glu Val Tyr 
370 375 380 

Tyr Ala Pro Val Val Asn Gly Val Pro Gln Asn Asn Ser Met Thr Lys
385 390 395 400

Leu Ala Ser Gly Thr Val Ser Tyr Ser Gly Tyr Ile Asn Val Pro Thr
405 410 415

Asn Ser Tyr Ser Leu Pro Lys Gly Lys Gly Ala Ile Val Val Val Ile
420 425 430

Asp Asn Thr Ala Asn Pro Asn Arg Glu Lys Ser Thr Leu Ala Tyr Glu
435 440 445

Thr Asn Ile Asp Ala Tyr Tyr Leu Tyr Glu Ala Lys Ala Asn Leu Gly
450 455 460

Glu Ser Tyr Ile Leu Gln Asn Asn Lys Phe Glu Asp Ile Asn Thr Tyr
465 470 475 480

Ser Glu Phe Ser Pro Cys Asn Phe Val Ile Lys Ala Ile Thr Lys Thr
485 490 495

Ser Ser Gly Gln Ala Thr Ser Gly Glu Ser Leu Thr Gly Ala Asp Arg
500 505 510

Tyr Glu Thr Ala Val Lys Val Ser Gln Lys Gly Trp Thr Ser Ser Gln
515 520 525

Asn Ala Val Leu Val Asn Gly Asp Ala Ile Val Asp Ala Leu Thr Ala
530 535 540

Thr Pro Phe Thr Ala Ala Ile Asp Ser Pro Ile Leu Leu Thr Gly Lys
545 550 555 560

Asp Asn Leu Asp Ser Lys Thr Lys Ala Glu Leu Gln Arg Leu Gly Thr
565 570 575

Lys Lys Val Tyr Leu Ile Gly Gly Glu Asn Ser Leu Ser Lys Asn Val
580 585 590

Gln Thr Gln Leu Ser Asn Met Gly Ile Ser Val Glu Arg Ile Ser Gly
595 600 605

Ser Asp Arg Tyr Lys Thr Ser Ile Ser Leu Ala Gln Lys Leu Asn Ser
610 615 620

Ile Lys Ser Val Ser Gln Val Ala Val Ala Asn Gly Val Asn Gly Leu
625 630 635 640

Ala Asp Ala Ile Ser Val Gly Ala Ala Ala Ala Asp Asn Asn Met Pro
645 650 655

Ile Ile Leu Thr Asn Glu Lys Ser Glu Leu Gln Gly Ala Asp Glu Phe
660 665 670

Leu Asn Ser Ser Lys Ile Thr Lys Ser Tyr Ile Ile Gly Gly Thr Ala
675 680 685

Thr Leu Ser Ser Asn Leu Glu Ser Lys Leu Ser Asn Pro Thr Arg Leu
690 695 700

Ala Gly Ser Asn Arg Asn Glu Thr Asn Ala Lys Ile Ile Asp Lys Phe
705 710 715 720

Tyr Pro Ser Ser Asp Leu Lys Tyr Ala Phe Val Val Lys Asp Gly Ser
725 730 735

Lys Ser Gln Gly Asp Leu Ile Asp Gly Leu Ala Val Gly Ala Leu Gly
740 745 750

Ala Lys Thr Asp Ser Pro Val Val Leu Val Gly Asn Lys Leu Asp Glu
755 760 765

Ser Gln Lys Asn Val Leu Lys Ser Lys Lys Ile Glu Thr Pro Ile Arg
770 775 780 

Val Gly Gly Asn Gly Asn Glu Ser Ala Phe Asn Glu Leu Asn Thr Leu 
785 790 795 800 

Leu Gly Lys 



<210> 26 
<211> 525 
<212> PRT 

<213> Clostridium difficile 
<400> 26 

Met Lys Ala Pro Lys Thr Ile Leu Thr Ile Leu Thr Ile Ala Leu Thr
1 5 10 15

Leu Ser Ser Ile Ser Ile Ile Pro Ser Tyr Ala Leu Thr Glu Glu Lys
20 25 30

Leu Ile Gly Asn Gly Arg Tyr Glu Thr Ala Val Lys Ile Ser Gln Lys
35 40 45

Ala Tyr Ser Ser Ser Pro Asn Val Val Leu Val Asn Asp Asn Ser Leu
50 55 60

Ala Asp Ala Leu Ser Ala Thr Pro Phe Ala Lys Ala Lys Gly Ala Pro
65 70 75 80

Ile Leu Leu Thr Glu Ser Asp Lys Leu Asp Asp Arg Thr Glu Lys Glu
85 90 95

Ile Lys Arg Leu Gly Ala Lys Asp Ile Tyr Leu Ile Gly Gly Thr Ala
100 105 110

Val Leu Asn Lys Asp Ile Glu Asn Lys Leu Lys Gly Asn Gly Leu Asn
115 120 125

Val Glu Arg Ile Asn Gly Lys Asn Arg Tyr Glu Thr Ser Leu Ile Leu
130 135 140

Ala Asn Lys Leu Lys Asp Ile Lys Asp Ile Lys Glu Val Ala Val Val
145 150 155 160

Asn Gly Glu Lys Gly Leu Ser Asp Ala Val Ser Val Gly Ala Pro Ala
165 170 175

Ala Gln Asn Lys Met Pro Ile Ile Leu Ser Asn Pro Lys Asp Gly Val
180 185 190

Glu Ala Phe Asp Lys Phe Ile Arg Asp Glu Lys Val Ile Lys Ala Tyr
195 200 205

Val Ile Gly Gly Thr Asn Ser Val Ser Arg Ala Val Glu Lys Ser Leu
210 215 220

Pro Asn Ala Glu Arg Val Ser Gly Lys Asp Arg Asn Glu Thr Asn Ala 
225 230 235 240

Lys Val Ile Glu Lys Phe Tyr Thr Asp Thr Asn Leu Ser Asn Leu Tyr
245 250 255

Val Thr Lys Asp Gly Ser Lys Asn Glu Asn Gln Leu Ile Asp Ser Leu
260 265 270

Ala Val Gly Val Leu Ala Ala Lys Asn Glu Ser Pro Ile Val Leu Val
275 280 285

Gly Asn Lys Leu Asn Thr Lys Gln Arg Asp Ile Leu Ser Thr Lys Lys
290 295 300

Leu Asn Thr Ile Thr Gln Val Gly Gly Asn Gly Asn Glu Glu Ala Phe
305 310 315 320

Asp Glu Ile Lys Ser Leu Gln Glu Lys Thr Val Phe Glu Ala Lys Thr
325 330 335

Val Glu Glu Leu Thr Asp Met Ile Asn Ile Ala Ser Pro Asn Asp Ile
340 345 350

Ile Asn Phe Lys Pro Lys Glu Asn Thr Val Asn Glu Ala Phe Arg Met
355 360 365

Val Thr Asn Lys Pro Ile Thr Val Asn Ile Lys Gly Asp Cys Ser Lys
370 375 380 

Thr Leu Thr Val Asp Met Pro Asn Gly Glu Val Asn Asn Tyr Ala Thr 
385 390 395 400

Leu Val Asn Val Ile Val Arg Asn Ile Gly Glu Gly Gly Phe Asn Asn
405 410 415

His Asp Thr Ile Thr Ile Leu Ser Val Arg Asp Lys Asn Gly Arg Val
420 425 430

Ile Glu Asn Thr Arg Asn Ser Asp Ile Asp Thr Leu Met Ile Leu Ala
435 440 445

Ser Ala Asn Asp Thr Lys Leu Ile Asn Asp Gly Tyr Ile Gly Lys Leu
450 455 460

Ile Asp Asn Ser Ser Asn Ser Asp Ile Thr Asn Asn Gly Thr Ile Asp
465 470 475 480

Lys Lys Val Asn Gln Val Glu Asp Leu Glu Ala Lys Val Asp Ser Ile
485 490 495

Glu Lys Ala Ile Asp Ser Ile Ser Gln Lys Val Asn Lys Ile Gln Asp
500 505 510

Ile Leu Asp Lys Leu Gly Phe Leu Lys Lys Phe Leu Ser
515 520 525



<210> 27 
<211> 679 
<212> PRT 

<213> Clostridium difficile 
<400> 27 

Met Arg Gly Asp Met Met Lys Lys Thr Thr Lys Leu Leu Ala Thr Gly 
1 5 10 15 

Met Leu Ser Val Ala Met Val Ala Pro Asn Val Ala Leu Ala Ala Glu 
20 25 30 

Asn Thr Thr Ala Asn Thr Glu Ser Asn Ser Asp Ile Asn Ile Asn Leu
35 40 45

Gln Arg Lys Ser Val Val Leu Gly Ser Lys Ser Asn Ala Ser Val Lys
50 55 60

Phe Lys Glu Lys Leu Asn Ala Asp Ser Ile Thr Leu Asn Phe Met Cys
65 70 75 80

Tyr Asp Met Pro Leu Glu Ala Thr Leu Asn Tyr Asn Glu Lys Thr Asp
85 90 95

Ser Tyr Glu Gly Val Ile Asn Tyr Asn Lys Asp Pro Glu Tyr Leu Asn
100 105 110

Val Trp Glu Leu Gln Ser Ile Lys Ile Asn Gly Lys Asp Glu Gln Lys
115 120 125

Val Leu Asn Lys Glu Asp Leu Glu Ser Met Gly Leu Asn Leu Lys Asp
130 135 140

Tyr Asp Val Thr Gln Glu Phe Ile Ile Ser Asp Ala Asn Ser Thr Lys
145 150 155 160

Ala Val Asn Glu Tyr Met Arg Lys Thr Ser Ala Pro Val Lys Lys Leu
165 170 175

Ala Gly Ala Thr Arg Phe Glu Thr Ala Val Glu Ile Ser Lys Gln Gly
180 185 190

Trp Lys Asp Gly Ser Ser Lys Val Val Ile Val Asn Gly Glu Leu Ala
195 200 205

Ala Asp Gly Ile Thr Ala Thr Pro Leu Ala Ser Thr Tyr Asp Ala Pro
210 215 220

Ile Leu Leu Ala Asn Lys Asp Asp Ile Pro Glu Ser Thr Lys Ala Glu
225 230 235 240

Leu Lys Arg Leu Asn Pro Ser Asp Val Ile Ile Ile Gly Asp Asp Gly
245 250 255

Ser Val Ser Gln Lys Ala Val Ser Gln Ile Lys Ser Ala Val Asn Val
260 265 270

Asn Val Thr Arg Ile Gly Gly Val Asp Arg His Glu Thr Ser Leu Leu
275 280 285

Ile Ala Lys Glu Ile Asp Lys Tyr His Asp Val Asn Lys Ile Tyr Ile
290 295 300

Ala Asn Gly Tyr Ala Gly Glu Tyr Asp Ala Leu Asn Ile Ser Ser Lys
305 310 315 320

Ala Gly Glu Asp Gln Gln Pro Ile Ile Leu Ala Asn Lys Asp Ser Val
325 330 335

Pro Gln Gly Thr Tyr Asn Trp Leu Ser Ser Gln Gly Leu Glu Glu Ala
340 345 350

Tyr Tyr Ile Gly Gly Ser Gln Ser Leu Ser Ser Lys Ile Ile Asp Gln
355 360 365

Ile Ser Lys Ile Ala Lys Asn Gly Thr Ser Lys Asn Arg Val Ser Gly
370 375 380

Ala Asp Arg His Glu Thr Asn Ala Asn Val Ile Lys Thr Phe Tyr Pro
385 390 395 400

Asp Lys Glu Leu Ser Ala Met Leu Val Ala Lys Ser Asp Ile Ile Val
405 410 415

Asp Ser Ile Thr Ala Gly Pro Leu Ala Ala Lys Leu Lys Ala Pro Ile
420 425 430

Leu Ile Thr Pro Lys Thr Tyr Val Ser Ala Tyr His Ser Thr Asn Leu
435 440 445

Ser Glu Lys Thr Ala Glu Thr Val Tyr Gln Ile Gly Asp Gly Met Lys
450 455 460

Asp Ser Val Ile Asn Ser Ile Ala Ser Ser Leu Ser Lys His Asn Ala
465 470 475 480

Pro Thr Glu Pro Asp Asn Ser Gly Ser Ala Ala Gly Lys Thr Val Val
485 490 495

Ile Asp Pro Gly His Gly Gly Ser Asp Ser Gly Ala Thr Ser Gly Leu
500 505 510

Asn Gly Gly Ala Gln Glu Lys Lys Tyr Thr Leu Asn Thr Ala Leu Ala
515 520 525

Thr Thr Glu Tyr Leu Arg Ser Lys Gly Ile Asn Val Val Met Thr Arg
530 535 540

Asp Thr Asp Lys Thr Met Ala Leu Gly Glu Arg Thr Ala Leu Ser Asn 
545 550 555 560 

Thr Ile Lys Pro Asp Leu Phe Thr Ser Ile His Tyr Asn Ala Ser Asn
565 570 575

Gly Ser Gly Asn Gly Val Glu Ile Tyr Tyr Lys Val Lys Asp Lys Asn
580 585 590

Gly Gly Thr Thr Lys Thr Ala Ala Ser Asn Ile Leu Lys Arg Ile Leu
595 600 605

Glu Lys Phe Asn Met Lys Asn Arg Gly Ile Lys Thr Arg Thr Leu Asp
610 615 620

Asn Gly Lys Asp Tyr Leu Tyr Val Leu Arg Asn Asn Asn Tyr Pro Ala
625 630 635 640

Ile Leu Val Glu Cys Ala Phe Ile Asp Asn Lys Ser Asp Met Asp Lys
645 650 655

Leu Asn Thr Ala Glu Lys Val Lys Thr Met Gly Thr Gln Ile Gly Ile
660 665 670

Gly Ile Glu Asp Thr Val Lys
675 



<210> 28 
<211> 351 
<212> PRT 

<213> Clostridium difficile 
<400> 28 

Met Ile Lys Lys Ile Ser Thr Ile Leu Ser Leu Val Leu Leu Ile Ser
1 5 10 15

Ile Ser Ser Thr Ile Gly Val Phe Ala Asp Ala Asn Pro Lys Arg Glu
20 25 30

Leu Ile Glu Gly Ser Ile Pro Glu Ile Ser Thr Glu Leu Asn Lys Arg
35 40 45

Ala Phe Lys Asp Ser Lys Glu Val Ile Leu Val Asn Glu Glu Ser Ile
50 55 60

Val Asp Ser Ile Ser Ala Thr Pro Leu Ala Tyr Ser Lys Asn Ala Pro
65 70 75 80

Ile Val Val Thr Lys Ser Lys Asn Leu Gly Arg Val Thr Arg Asn Tyr
85 90 95

Leu Lys Glu Leu Gly Pro Glu Lys Val Thr Ile Val Gly Gly Leu Lys
100 105 110

Ala Val Ser Lys Asp Ala Glu Arg Asn Ile Glu Lys Met Gly Met Lys
115 120 125

Val Glu Arg Ile Arg Gly Lys Asp Arg Tyr Asp Thr Ser Leu Lys Ile
130 135 140 

Ala Arg Glu Met Tyr Arg Thr Val Gly Phe Asp Glu Ala Phe Leu Leu 
145 150 155 160 

Ser Ser Thr Thr Gly Leu Glu Asn Ala Ile Ser Val Tyr Ser Tyr Ala
165 170 175

Ala Lys Ser Gly Met Pro Ile Ile Trp Ala Lys Asp Glu Gly Phe Glu
180 185 190

Glu Gln Ile Asp Phe Leu Lys Gly Lys Asn Leu Lys Lys Ile Tyr Ala
195 200 205

Leu Gly Asp Ser Lys Glu Phe Ile Ala Glu Ile Asp Ser Asn Leu Lys
210 215 220

Asn Ile Glu Gly Ile Lys Gln Ile Asn Lys Ser Ser Thr Asn Val Asp
225 230 235 240

Leu Ile Lys Lys Phe Tyr Asp Glu Lys Asp Ile Lys Lys Ile Tyr Thr
245 250 255

Ala Arg Leu Asp Phe Gly Ser Arg Ser Asp Val Asn Glu Tyr Ile Ser
260 265 270

Leu Gly Val Val Ser Ala Lys Glu Asn Met Pro Ile Leu Ile Cys Ser
275 280 285

Asp Asn Leu Ser Arg Ala Gln Asp Lys Phe Leu Lys Asp Ser Asn Ile
290 295 300

Asn Asp Val Val Glu Val Gly Tyr Thr Val Gly Asp Tyr Ser Leu Phe
305 310 315 320

Lys Ser Ile Phe Asn Leu Thr Phe Leu Ser Cys Ile Val Leu Ile Leu
325 330 335

Leu Leu Leu Leu Ile Thr Phe Arg Ala Leu Arg Tyr Glu Ser Lys
340 345 350 



<210> 29 
<211> 631 
<212> PRT 

<213> Clostridium difficile 
<400> 29 

Met Leu Ser Asn Lys Lys Arg Ser Met Ala Ile Val Met Ala Gly Ala
1 5 10 15

Thr Val Met Ser Ala Ala Ala Pro Ile Phe Ala Asp Asn Thr Val Thr
20 25 30

Glu Asn Val Asp Lys Asn Tyr Thr Val Ser Ala Lys Asp Ser Ala Lys
35 40 45

Leu Ile Glu Glu Val Arg Lys Ala Leu Glu Val Lys Phe Glu Asp Thr
50 55 60

Lys Ala Gly Ala Asn Val Asn Asp Arg Val Tyr Asp Ile Lys Val Asp
65 70 75 80

Asn Val Asn Leu Thr Asn Ala Thr Gln Leu Gln Asn Lys Ile Asn Ser
85 90 95

Leu Thr Glu Gly Gln Ser Leu Lys Val Thr Ile Gln Asp Lys Gly His
100 105 110

Gln Val Leu Gly Gly Lys Val Val Asp Tyr Lys Ile Glu Asn Tyr Lys
115 120 125

Thr Ala Gln Glu Ile Val Asp Ala Val Asn Ala Tyr Asn Ala Thr Leu
130 135 140

Ala Glu Asp Ser Asp Asn Lys Leu Thr Ala Thr Ile Lys Ser Thr Asn
145 150 155 160

Thr Val Glu Val Lys Arg Ala Lys Asp Ser Ala Asn Val Ile Thr Leu
165 170 175

Asn Val Gly Asp Gln His Leu Asp Phe Ser Lys Val Ile Thr Ser Glu
180 185 190

Glu Gly Thr Phe Glu Gly Tyr Glu Lys Arg Tyr Ser Asp Ile Asp Ser
195 200 205

Lys Glu Leu His Thr Val Thr Val Lys Asn Ala Asp Leu Gln Asp Ile
210 215 220

Ser Ala Glu Glu Leu Phe Asp Gly Ile Arg Leu Thr Thr Leu Gly Arg
225 230 235 240

Glu Ile Val Asn Lys Val Lys Asn Gly Tyr Ala Leu Thr Phe Glu Asn
245 250 255

Glu Ala Ile Leu Thr Gln Glu Gln Glu Asp Ser Asp Asp Lys Asp Lys
260 265 270

Pro Glu Lys Ser Ser Phe Asp Ile Val Leu Ser Lys Ala Asn Glu Lys
275 280 285

Pro Glu Thr Ile Ser Val Ser Ser Lys Asn His Lys Leu Val Arg Asp
290 295 300

Leu His Lys Val Leu Thr Asp Val Lys Asp Gly Lys Glu Leu Lys Val
305 310 315 320

Glu Val Leu Ser Gly Asp Ser Arg Phe Thr Thr Ala Val Glu Val Ser 
325 330 335

Lys Glu Arg Phe Lys Asp Gly Glu Ala Glu Ala Ile Ile Leu Val Gly
340 345 350

Glu Asp Ala Ile Val Asp Gly Leu Ala Ser Ala Pro Leu Ala Ser Gln
355 360 365

Lys Asn Ala Pro Ile Leu Leu Ser Lys Lys Asp Ser Leu Pro Ser Glu
370 375 380

Ile Glu Ala Glu Ile Leu Arg Val Leu Gly Ser Asn Leu Ser Ser Lys
385 390 395 400

Lys Ile Tyr Ile Val Gly Gly Glu Ser Lys Val Ser Lys Glu Thr Glu
405 410 415

Glu Lys Leu Ser Lys Leu Gly Val Ser Lys Val Glu Arg Val Ser Gly
420 425 430

Glu Asp Arg Phe Glu Thr Ser Leu Glu Ile Ala Lys Gln Leu Lys Asp
435 440 445

Thr Phe Lys Thr Ala Phe Val Val Gly Gly Asn Gly Glu Ala Asp Ala
450 455 460

Met Ser Ile Ser Ala Arg Ala Ala Gln Phe Gly Ala Pro Ile Ile Val
465 470 475 480

Thr Gly Asn Glu Leu Asp Ala Asn Ala Glu Lys Leu Leu Lys Gly Lys
485 490 495

Glu Leu Glu Ile Val Gly Gly Glu Asn Ser Val Ser Lys Glu Val Glu
500 505 510

Asp Lys Leu Val Asp Ile Asp Leu Asn Asn Lys Val Glu Arg Leu Ala
515 520 525

Gly Glu Asn Arg Lys Asp Thr Asn Ala Lys Val Ile Asn Lys Tyr Tyr
530 535 540

Ala Gly Ala Thr Lys Ala Tyr Val Ala Lys Asp Gly Tyr Val Gly Gly
545 550 555 560

Asn Gly Gln Leu Val Asp Ala Leu Thr Ala Ala Pro Leu Ala Ala Ser
565 570 575

Ser Lys Ala Pro Ile Val Leu Thr Thr Glu Glu Leu Ser Lys Ser Gln
580 585 590

Glu Glu Val Val Glu Leu Arg Leu Lys Asn Ala Thr Lys Leu Val Gln
595 600 605

Ile Gly Glu Gly Ile Ala Lys Asn Ala Ile Glu Lys Ile Ala Glu Lys
610 615 620

Ile Asn Leu Phe Thr Lys Asn
625 630 



<210> 30 
<211> 477 
<212> PRT 
<213> Clostridium difficile
<400> 30 

Met Lys Ser Thr Leu Gly Val Glu Asn Asn Met Lys Asn Ser Lys Lys 
1 5 10 15

Ile Leu Ala Ile Gly Leu Thr Leu Phe Leu Val Met Val Asn Thr Pro
20 25 30

Met Val Ser Ala Leu Thr Ser Val Glu Gln Ile Lys Gly Asn Asp Arg
35 40 45

Tyr Glu Thr Ala Ala Lys Ile Ala Asp Lys Gln Asn Tyr Asn Thr Ala
50 55 60

Ile Leu Ile Asn Ser Asp Asn Ser Leu Ala Asp Gly Leu Ser Ala Ser
65 70 75 80

Gly Leu Ala Gly Ala Leu Asn Ala Pro Ile Leu Met Thr Lys Gln Asn
85 90 95

Gln Ile Pro Asn Thr Thr Met Glu Arg Leu Asn Lys Ala Lys Thr Val
100 105 110

Tyr Ile Ile Gly Ser Glu Ser Thr Ile Ser Lys Asn Val Glu Asn Gln
115 120 125

Leu Leu Ser Lys Lys Lys Val Val Gln Arg Ile Phe Gly Glu Asn Arg
130 135 140

Phe Asp Thr Ser Ile Lys Ile Ala Glu Lys Ile Lys Glu Ile Lys Pro
145 150 155 160

Ile Asp Lys Val Ile Ile Ala Asn Gly Phe Thr Gly Glu Ala Asp Ala
165 170 175

Ile Ser Ala Ser Pro Val Ala Ala Arg Asp Gly Val Pro Ile Ile Leu
180 185 190 

Thr Asp Gly Asn Ser Val Gly Phe Asp Thr Thr Gly Leu Lys Ser Tyr 
195 200 205 

Ala Leu Gly Ser Ser Glu Ile Ile Ser Asp Glu Leu Val Lys Ser Thr
210 215 220

Asn Ser Ile Arg Leu Gly Gly Thr Asp Arg Phe Glu Thr Asn Lys Ile
225 230 235 240

Val Ile Gln Glu Phe Tyr Lys Asn Ser Lys Glu Phe Tyr Leu Ser Lys
245 250 255

Gly Leu Gln Leu Thr Asp Ala Leu Ala Ala Ser Thr Ile Ala Lys Asn
260 265 270

Ala Pro Val Val Leu Val Glu Asn Gly Ser Asn Lys Ser Ile Leu Ser
275 280 285

Gly Ala Asp Lys Leu Thr Val Leu Gly Gly Ile Asn Gln Asn Val Ile
290 295 300

Lys Gln Cys Ile Asn Gln Ala Ser Pro Asn Gln Gln Gly Leu Tyr Tyr
305 310 315 320

Asn Pro Asn Asp Arg Ala Phe Lys Glu Arg Ile Lys Gly Lys Val Tyr
325 330 335

Ala Leu Thr Lys Gln Tyr Arg Lys Glu Asn Gly Val Arg Ala Leu Ser
340 345 350

Val Ala Ser Arg Leu Glu Gly Leu Ala Asn Asp Trp Ser Asn Leu Met
355 360 365

Ala Asn Lys Lys Thr Leu Ser His Thr Ile Asn Gly Lys Asn Ser Tyr
370 375 380

Ser Thr Phe Leu Lys Tyr Leu Asp Trp Ser Glu Ile Lys Pro Gly Tyr
385 390 395 400

Ile Ala Val Gln Gly Glu Asn Ile Ile Lys Tyr Lys Ile Pro Asp Lys
405 410 415

Pro Val Tyr Thr Asn Arg Asp Ala Asp Asp Ile Gly Asn Phe Ile Phe
420 425 430

Asn Glu Trp Lys Thr Asn Pro Glu Glu Gly Thr Asn Met Leu His Lys
435 440 445

Gly Tyr Glu Ile Met Gly Phe Gly Ile Ala Ile Thr Gly Asp Lys Asn
450 455 460

Leu Tyr Ala Thr His Glu Phe Tyr Gly Arg Tyr Lys Glu
465 470 475



<210> 31
<211> 626 
<212> PRT 

<213> Clostridium difficile 
<400> 31 

Met Asn Lys Arg Lys Ser Phe Ile Arg Thr Ile Ala Val Ser Thr Met
1 5 10 15

Ala Val Ala Val Thr Gly Ser Ala Thr Cys Ala Tyr Ala Ala Pro Val
20 25 30

Leu Gln Gly Thr Lys Thr Tyr Glu Lys Val Asn Thr Ile Asp Ile Ser
35 40 45

Val Asp Ser Val Glu Asn Ile Val Tyr Ser Phe Gln Ala Ser Ile Lys
50 55 60

Val Gln Gly Glu Val Glu Val Val Asp Asn Glu Gln Lys Glu Lys Ile
65 70 75 80

Thr Trp Ser Asp Asn Ile Lys Ser Gln Ile Lys Ser Gly Asn Ala Asn
85 90 95

Ala Thr Cys Arg Ala Glu Tyr Asn Lys Ser Ser Asn Thr Thr Thr Leu
100 105 110

Asp Ile Tyr Val Thr Ser Asn Glu Asp Leu Leu Asp Gly Asn Arg Leu
115 120 125

Asn Ile Gly Arg Ile Ser Val Lys Lys Ser Gly Ser Asn Ser Asn Ala
130 135 140 

Asp Tyr Lys Val Leu Gly Lys Gly Thr Ser Asp Lys Pro Ala Leu Lys 
145 150 155 160

Ile Val Thr Tyr Asn Asn Lys Thr Val Asp Tyr Glu Asn Ile Ser Ser
165 170 175

Asp Glu Gly Leu Ile Phe Thr Leu Ile Asn Glu Ser Glu Val Lys Pro
180 185 190

Ile Gly Gly Thr Gly Ser Ser Lys Asn Asp Pro Glu Lys Tyr Lys Val
195 200 205

Glu Lys Ser Glu Ala Leu Glu Tyr Leu Leu Asn Asn Ile Arg Ile Asn
210 215 220

Tyr Ser Ile Val Ser Lys Glu Thr Gln Glu Ser Gly Ser Asn Val Ile
225 230 235 240

Leu Lys Leu Gly Leu Ala Gln Lys Thr Thr Lys Gly Arg Lys Ala Thr
245 250 255

Ile Asn Lys Tyr Val Glu Val Thr Leu Pro Lys Ser Leu Glu Tyr Ile
260 265 270 

Val Glu Asn Glu Leu Ser Lys Pro Asp Glu Leu Pro Pro Asp Asn Gly 
275 280 285 

Ser Gly Gly Asn Asn Gly Gly Gly Ser Asn Ser Gly Gly Ser Ser Ser 
290 295 300 

Gly Gly Ser Ser Gly Gly Gly Asn Ser Ser Asp Ser Thr Ser Asn Val 
305 310 315 320 

Thr Val Lys Lys Leu Lys Gly Ala Asp Arg Phe Glu Thr Ala Ile Lys
325 330 335

Ile Ser Gln Ser Gly Trp Thr Lys Ser Asp Thr Val Val Ile Val Asn
340 345 350

Gly Glu Asp Lys Ser Met Val Asp Gly Leu Thr Ala Thr Pro Leu Ala
355 360 365

Ser Val Lys Asn Ser Pro Ile Leu Leu Ser Ser Asn Glu Lys Leu Pro
370 375 380

Gln Lys Thr Val Glu Glu Leu Lys Arg Leu Asn Pro Ser Lys Val Val
385 390 395 400

Val Ile Gly Gly Asn Asn Ser Met Pro Asn Ser Val Val Glu Ala Ile
405 410 415

Lys Ala Val Asn Ser Lys Ile Ser Val Gln Arg Ile Gly Gly Asp Thr
420 425 430

Arg Tyr Gln Thr Ser Ile Asn Ile Ala Lys Glu Ile Asp Arg Thr Asn
435 440 445

Asn Val Ser Lys Leu Tyr Ile Gly Ala Gly Asn Gly Glu Ala Asp Ser
450 455 460

Leu Ser Ile Ala Ser Leu Ala Gly Lys Glu Lys Thr Pro Ile Val Leu
465 470 475 480

Thr Gln Lys Asp Gly Val Asp Asn Glu Ala Glu Gln Phe Ile Lys Ser
485 490 495

Asn Lys Val Ser Asn Ile Tyr Phe Ile Gly Gly Val Glu Lys Ile Ser
500 505 510

Asn Lys Ala Ile Glu Gln Val Gly Lys Ile Ala Asn Lys Asp Ile Ser
515 520 525

Asn Asn Arg Val Ala Gly Gln Thr Arg Gln Glu Thr Asn Ala Lys Val
530 535 540

Ile Asp Lys Phe Tyr Ser Gln Ser Lys Leu Asp Gly Val Val Val Ala
545 550 555 560

Asn Gln Asp Lys Leu Ile Asp Ala Leu Ala Val Gly Pro Leu Ala Ala
565 570 575

Lys Asn Asn Ser Pro Val Ile Leu Ala Thr Asn Thr Leu Asp Lys Ser
580 585 590

Gln Glu Ser Ser Leu Lys Gly Lys Asn Ser Ser Lys Leu Phe Glu Val
595 600 605

Gly Gly Gly Ile Ala Ser Ser Val Ile Asp Lys Ile Lys Ser Leu Ile
610 615 620 

Glu Lys 
625 



<210> 32 
<211> 550 
<212> PRT 

<213> Clostridium difficile 
<400> 32 

Met Glu Asn Asn His Asn Ile Asn Ile Lys Tyr Lys Asn His Gln Gly
1 5 10 15

Asp Met Lys Met Asn Lys Lys Ile Leu Ser Leu Gly Leu Ala Val Ser
20 25 30

Leu Ile Leu Val Asn Phe Lys Ser Val Asn Ala Ser Ser Val Val Glu
35 40 45

Lys Ile Tyr Gly Lys Asp Arg Tyr Glu Thr Ala Ala Lys Ile Ala Asp
50 55 60

Lys Gln Thr Tyr Glu Thr Val Ile Leu Val Asn Thr Glu Lys Ser Leu
65 70 75 80

Ala Asp Gly Leu Ser Val Ser Gly Leu Ser Gly Ala Thr Lys Ala Pro
85 90 95

Ile Leu Phe Thr Gln Gln Asn Lys Ile Pro Ala Asp Thr Asn Arg Cys
100 105 110

Leu Lys Asn Ile Lys Lys Ala Tyr Ile Ile Gly Thr Glu Asp Thr Ile
115 120 125

Ser Lys Ser Val Glu Lys Glu Leu Asp Ser Lys Asn Ile Glu Val Lys
130 135 140

Arg Ile Gly Gly Glu Asp Arg Leu Lys Thr Ser Tyr Leu Ile Ala Lys
145 150 155 160

Glu Ile Ala Thr Ile Lys Lys Val Asp Lys Val Leu Leu Thr Asn Ala
165 170 175 

Tyr Ser Gly Glu Ala Asp Ala Met Ser Val Ser Ser Val Ala Thr Arg 
180 185 190 

Asp Gly Ala Pro Ile Ile Leu Thr Asp Gly Lys Ser Val Pro Phe Asp
195 200 205

Val Lys Asn Ile Gln Ser Tyr Cys Ile Gly Ser Glu Glu Ile Met Ser
210 215 220

Asn Pro Leu Val Lys Asn Thr Asn Ser Val Arg Ile Glu Gly Thr Asp
225 230 235 240

Arg Phe Glu Thr Asn Lys Asn Val Ile Asp Tyr Phe Phe Asn Ser Ala
245 250 255

Asp Gly Phe Tyr Val Ser Asp Gly Tyr Gln Leu Val Asp Ala Ile Ala
260 265 270 

Ala Ala Pro Leu Thr Lys Asn Ser Pro Met Val Leu Val Asn Asp Gly 
275 280 285 

Ser Asp Lys Thr Val Leu Glu Gly Ala Lys Asn Ile Thr Ser Val Gly
290 295 300

Glu Ile Asn Glu Lys Val Ile Gln Gln Cys Ile Asn Ala Ser Lys Ser
305 310 315 320

Asn Gly Gln Pro Pro Thr Ile Thr Val Gly Ser Thr Glu Val Tyr Lys
325 330 335

Gly Glu Lys Phe Asp Thr Gly Lys Leu Asn Ile Val Ala Lys Asp Asn
340 345 350

Thr Gly Lys Val Leu Pro Ile Glu Val Asp Gly Phe Ile Asp Thr Asn
355 360 365

Arg Val Gly Thr Tyr Ile Leu Thr Leu Lys Ala Thr Asp Glu Trp Gly
370 375 380

Lys Ser Thr Gly Lys Arg Val Glu Ile Lys Val Leu Asp Asp Lys Ser
385 390 395 400

His Asp Tyr Asn Ser Pro Glu Phe Lys Lys Met Val Ser Thr Glu Met
405 410 415

Tyr Asn Leu Ile Asn Ser Tyr Arg Lys Glu Lys Gly Lys Glu Pro Leu
420 425 430 

Val Val Ser Ser Arg Leu Glu Gly Met Ala Asn Ala Trp Ser Lys Tyr 
435 440 445 

Met Met Asp Lys Lys Val Phe Ala His Tyr Ile Asp Gly Lys Asn Ala
450 455 460

Pro Gln Val Phe Ser Glu Phe Gly Met Arg Ser Glu Glu Asn Ile Ala
465 470 475 480

Tyr Ile Tyr Ile Asp Ser Lys Asn Val Gln Thr Thr Gln Asp Ala Lys
485 490 495

Asp Leu Ala Lys Ala Ile Phe Glu Val Trp Lys Lys Ser Pro Glu Tyr
500 505 510

Asn Ala Asn Met Leu Ser Asp Glu Phe Tyr Ser Thr Gly Phe Gly Leu
515 520 525

Tyr Ile Leu Ser Asp Gly Gln Val His Ala Thr Gln Glu Phe Leu Asn
530 535 540 

Gly Asn Glu Gly Ser Leu 
545 550 



<210> 33 
<211> 528 
<212> PRT 

<213> Clostridium difficile 
<400> 33 

Met Lys Val Asn Lys Arg Val Leu Ser Ile Gly Leu Ala Ile Ser Leu
1 5 10 15

Ile Met Ala Gly Ala Pro Asn Ile Asn Ala Leu Ser Ser Ile Glu Lys
20 25 30

Ile Gln Gly Lys Asp Arg Tyr Glu Thr Ala Ala Lys Ile Ala Gln Lys
35 40 45

Gln Thr Tyr Glu Asn Val Val Leu Val Asn Thr Asp Asn Thr Leu Ala
50 55 60

Asp Gly Leu Ser Ala Ser Gly Leu Ala Gly Thr Val Lys Ala Pro Ile
65 70 75 80

Leu Leu Ser Gln Arg Asn Ser Ile Pro Ser Asp Thr Glu Lys Met Leu
85 90 95

Lys Asp Val Lys Lys Val Tyr Ile Ile Gly Thr Glu Asp Ser Ile Gly
100 105 110

Lys Ser Val Glu Asn Glu Leu Lys Gln Lys Gly Ile Glu Val Lys Arg
115 120 125

Ile Gly Gly Asn Asp Arg Ile Glu Thr Ser Tyr Leu Ile Ala Lys Glu
130 135 140

Ile Ala Ser Ile Lys Pro Ile Asp Lys Val Phe Ile Thr Asn Gly Tyr
145 150 155 160 

Thr Gly Glu Ala Asp Ala Met Ser Ala Ser Ser Val Ala Ser Arg Asp 
165 170 175 

Gly Ala Pro Ile Ile Leu Thr Asn Gly Lys Asn Val Pro Phe Glu Lys
180 185 190

Lys Glu Gly Val Gln Cys Tyr Ala Leu Gly Ser Glu Glu Ile Ile Ser
195 200 205

Asn Asp Leu Val Lys Lys Thr Asn Ser Val Arg Leu Ala Gly Glu Asp
210 215 220

Arg Phe Glu Thr Asn Lys Lys Val Ile Lys His Phe Tyr Ser Ser Ala
225 230 235 240

Lys Glu Phe Tyr Leu Ser Lys Gly Tyr Gln Leu Val Asp Ala Val Ala
245 250 255

Gly Ser Ser Ile Ala Lys Asn Ala Pro Ile Val Leu Val Asp Gly Asn
260 265 270

Ser Asp Lys Ser Val Leu Arg Ser Ala Asp Lys Ile Thr Ala Leu Gly
275 280 285

Gly Ile Asp Glu Lys Thr Leu Glu Gln Cys Leu Ser Ala Ser Ser Leu
290 295 300

Asp Ala Ser Ala Pro Thr Ile Thr Val Gly Asn Leu Asn Ile Tyr Gln
305 310 315 320

Gly Asp Lys Phe Asp Ile Ser Lys Leu Asn Ile Val Ala Lys Asp Ser
325 330 335

Asn Gly Asn Asp Leu Thr Pro Glu Leu Ile Gly Asn Ile Asn Thr Asp
340 345 350

Lys Val Gly Lys Tyr Lys Val Thr Ile Lys Ala Thr Asp Ile Gly Gly
355 360 365

Lys Thr Thr Ser Ile Ser Val Glu Val Asn Val Leu Glu Tyr Lys Thr
370 375 380 

Asn Asp Met Asn Ser Ser Glu Phe Lys Arg Met Val Ser Ser Glu Met 
385 390 395 400 

Tyr Ser Leu Val Asn Ser Tyr Arg Lys Glu Lys Gly Lys Glu Pro Leu 
405 410 415

Gln Val Ser Glu Asn Leu Gln Gly Met Ala Asn Tyr Trp Ser Lys Tyr
420 425 430

Met Ala Asp Lys Gly Glu Phe Ala His Val Ile Asn Gly Lys Asn Ala
435 440 445

Ala Glu Val Phe Ser Gly Gly Ile Arg Ser Glu Glu Asn Ile Ala Phe
450 455 460

Val Pro Leu Thr Thr Lys Ser Thr Tyr Thr Thr Lys Asp Ala Arg Glu
465 470 475 480

Val Ala Asn Val Ile Phe Thr Val Trp Lys Lys Ser Asp Lys Tyr Asn
485 490 495

Glu Asn Met Leu Ser Ser Lys Phe Ala Tyr Thr Gly Phe Gly Leu Tyr
500 505 510

Ile Leu Pro Asp Gly Gln Val Tyr Ala Thr Gln Glu Phe Leu Asn Lys
515 520 525



<210> 34 

<211> 781 

<212> PRT 

<213> Clostridium difficile 

<400> 34 

Met Ser Val Ile Asp Ser Ile Leu Asp Lys Ala Asp Glu Gln Glu Ile
1 5 10 15

Lys Lys Leu Asn Val Ile Val Asp Lys Ile Asp Ala Leu Glu Asp Ser
20 25 30

Met Lys Asn Leu Ser Tyr Glu Glu Leu Lys Asp Met Thr Ala Ile Phe
35 40 45

Lys Asn Arg Leu Lys Lys Gly Glu Thr Leu Asp Asp Ile Leu Pro Glu
50 55 60

Ala Phe Ala Val Val Arg Glu Val Ser Lys Arg Lys Leu Gly Met Arg
65 70 75 80

Gln Tyr Arg Val Gln Leu Ile Gly Gly Ile Val Ile His Gln Gly Lys
85 90 95

Ile Ala Glu Met Lys Thr Gly Glu Gly Lys Thr Leu Val Glu Val Ala
100 105 110

Pro Val Tyr Leu Asn Ala Leu Thr Gly Lys Gly Val His Val Ile Thr
115 120 125

Val Asn Asp Tyr Leu Ala Glu Arg Asp Lys Glu Leu Met Ser Pro Val
130 135 140

Tyr Glu Ser Leu Gly Met Thr Val Gly Val Ile Ile Ser Asn Gln Asp
145 150 155 160

Pro Asn Ile Arg Lys Gln Gln Tyr Lys Cys Asp Ile Thr Tyr Gly Thr
165 170 175 

Asn Ser Glu Phe Gly Phe Asp Tyr Leu Arg Asp Asn Met Val Pro Asp 
180 185 190 



Leu Ser His Lys Val Gln Arg Glu Leu Asn Phe Ala Ile Val Asp Glu
195 200 205

Val Asp Ser Ile Leu Ile Asp Glu Ala Arg Thr Pro Leu Ile Ile Ala
210 215 220

Gly Asp Gly Asp Glu Asp Leu Lys Leu Tyr Glu Leu Ala Asn Ser Phe
225 230 235 240

Val Lys Thr Val Lys Glu Glu Asp Phe Glu Leu Asp Arg Lys Asp Lys
245 250 255

Thr Ile Ala Leu Thr Ala Ser Gly Ile Ser Lys Ala Glu Ser Phe Phe
260 265 270

Gly Ile Thr Asn Leu Thr Asp Ile Lys Asn Ile Glu Leu Tyr His His
275 280 285

Ile Asn Gln Ala Leu Arg Gly His Lys Leu Met Glu Lys Asp Val Asp
290 295 300

Tyr Val Ile Ser Asn Gly Glu Val Met Ile Val Asp Glu Phe Thr Gly
305 310 315 320

Arg Val Met Asp Gly Arg Arg Tyr Thr Asp Gly Leu His Gln Ala Ile
325 330 335

Glu Ala Lys Glu Gly Val Glu Ile Lys Asn Glu Ser Lys Thr Met Ala
340 345 350

Thr Val Thr Tyr Gln Asn Phe Phe Arg Leu Tyr Glu Lys Leu Ser Gly
355 360 365

Met Thr Gly Thr Ala Lys Thr Glu Glu Gly Glu Phe Glu Ser Ile Tyr
370 375 380

Lys Leu Asn Val Val Gln Ile Pro Thr Asn Arg Pro Val Ile Arg Ala
385 390 395 400

Asp Leu His Asp Lys Val Phe Lys Thr Glu Glu Glu Lys Tyr Ser Ala
405 410 415

Val Val Glu Glu Ile Ile Arg Ile His Lys Thr Arg Gln Pro Ile Leu
420 425 430

Val Gly Thr Val Ser Val Glu Lys Ser Glu Lys Leu Ser Lys Met Leu
435 440 445

Lys Lys Gln Gly Ile Lys His Gln Val Leu Asn Ala Lys Gln His Asp
450 455 460

Lys Glu Ala Glu Ile Ile Ser Lys Ala Gly Lys Leu Asp Ala Ile Thr
465 470 475 480

Ile Ala Thr Asn Met Ala Gly Arg Gly Thr Asp Ile Ser Leu Gly Ala
485 490 495

Gly Asp Lys Glu Glu Glu Gln Glu Val Lys Asp Leu Gly Gly Leu Tyr
500 505 510

Val Ile Gly Thr Glu Arg His Glu Ser Arg Arg Ile Asp Asn Gln Leu
515 520 525

Arg Gly Arg Ser Gly Arg Gln Gly Asp Pro Gly Thr Ser Arg Phe Phe
530 535 540

Val Ser Leu Glu Asp Asp Val Ile Lys Leu Tyr Gly Gly Lys Thr Ile
545 550 555 560

Glu Lys Leu Met Lys Arg Thr Ser Ser Asn Glu Asn Thr Ala Ile Glu
565 570 575

Ser Lys Ala Leu Thr Arg Ala Ile Glu Arg Ala Gln Lys Gly Val Glu
580 585 590

Gly Lys Asn Phe Glu Ile Arg Lys Asn Val Leu Lys Tyr Asp Asp Thr
595 600 605

Ile Asn Glu Gln Arg Lys Val Ile Tyr Asn Glu Arg Asn Lys Val Leu
610 615 620

Asn Asp Glu Asp Ile Gln Glu Asp Ile Gln Lys Met Val Lys Asp Ile
625 630 635 640

Ile Gln Glu Ala Gly Glu Thr Tyr Leu Ile Gly Arg Lys Arg Asp Tyr
645 650 655

Tyr Gly Tyr Phe Lys His Leu Tyr Ser Thr Phe Met Pro Ala Asp Thr
660 665 670

Leu Leu Ile Pro Gly Val Asp Lys Lys Ser Val Gln Glu Ile Ile Asp
675 680 685

Ser Thr Tyr Glu Ile Ser Lys Arg Val Tyr Asp Leu Lys Lys Met Met
690 695 700

Leu Gly Ile Asp Lys Val Ala Glu Leu Glu Lys Thr Val Leu Leu Lys
705 710 715 720

Val Val Asp Gln Tyr Trp Ile Asp His Ile Asp Ala Met Glu Gln Leu
725 730 735

Lys Gln Tyr Ile Gly Leu Lys Ser Tyr Ala Gln Lys Asp Pro Phe Lys
740 745 750

Glu Tyr Ala Leu Glu Gly Tyr Asp Met Phe Glu Ala Leu Asn Lys Asn
755 760 765

Ile Arg Glu Ala Thr Val Gln Tyr Leu Tyr Lys Phe Asn
770 775 780



<210> 35 
<211> 48551 
<212> DNA 

<213> Clostridium difficile 
<400> 35 

aggtatgatt tatgttttaa taggagttat agctatagca ggactatttg ctataaaaaa 60
gtttttagta acaaaagaca agagtgctct taatataatg agtattcatg atgaagaagc 120
aattaaggaa caacagaaca aagaagacga agataaatct atttagacaa taaattatat 180
gaatagctta ttacaaatat atcttaagaa ataatgagaa ataaatatga ttaaggagaa 240
ctttatgtac gcagtaagat acgaattgat aaagacttgt aagcagagtg gtgcaaggtt 300
aggaagatta cacactccac atgggattat tgaaacacca atattcatgc cagttggaac 360
tcaagcaact gtaaaatcta tgacaccaga agaattaaaa gagattggtt cacaaataat 420
acttagtaac acataccatt tatatatgag accaggtcat gaacttataa aaagagctgg 480
tggtcttcat aagtttatga attgggataa accaatactt acagatagtg gagggttcca 540
ggtatttagt ctagggcctt taagaaagat aaaagaagaa ggtgtagaat ttaggtcaca 600
cttagatggt tcaaaacatt ttctaactcc agaaaaagct atggaaatac aaaatgcatt 660
gggttcagac ataatgatgg catttgatga gtgtgcacca tacccatcag atagagaata 720
tgtaaaaaat tctttagaaa gaacaacaag atggttaaaa agatgtaaag atgctcataa 780
taataccgac aagcaagctt tatttggtat aatacaaggt gggatgtata aagacttaag 840
ggaacaatca gcaaaagaaa taactaatat tgatttacca ggatatgcta ttggtggact 900
tagtgttgga gaaccaaaac cacttatgta tgacgtgtta gagcatacaa ctccacttat 960
gccaaaggat aaaccaagat acctaatggg agttggtagt ccagatgact tagtagaagg 1020
tgttataaga ggagtagaca tgtttgattg tgttctacct actcgtatcg ctagaaatgg 1080
aactgcaatg actagccaag gtaaagtagt agttagaaat gcaacttatg ctgaagactt 1140
tactccactt gaccctgaat gtgattgtta tgcttgtaaa aactattcaa gagcttatat 1200
aagacacctt ataaaagcaa atgaaatact gggtgcaaga ttaataacta ctcataatct 1260
tcacttttta ttaaatttaa tgaaacaaat aagacaagcg ataatggaag atagattact 1320
tgattttaga aatgaatttt ttgcaaaata tggatatgaa atataggttt tgcactatta 1380
attataatca ttttaatata taatattcaa tagtactcaa agatagagaa agtaaattgt 1440
tatataaaca tttattatga tttacaaaag agtataattt acaaaggagt gaaaaattaa 1500
tgccacagca acaaataatt atgagtatag gtctatgggt agtagtaatc gcaatctttt 1560
atttcctaat gattagacct caaaagaaaa aagacaaaca attaaaagaa atgagaagta 1620
gcctaagtgt aggagacaag gtcattacaa taggtggaat agtagctaat gtagctaaag 1680
ttgaagatga tgttgtaata ttagaattag gaccaaatag aactaaagtt ccatttgaaa 1740
aatgggctat aggaactgtt aaatctaaaa aagaagaaat agaagaagac taaattagta 1800
atttaaaagc cttttctcga atatatatta atttgagagg aggttttttt atgggaaaat 1860
ctaattatat ttttaaaggt ctaggatatg catatataat aactttggct gttttattag 1920
tgtataactt atttttgaca tttacagata ttggtggaga taatataact atggtctcat 1980
catttataac tactatttct gcagctatag gagggtttta cacttcaaaa cacatgaaag 2040
aaaaaggact tatgtatggg cttttagtag gacttttgta catagtatgt atttttttga 2100
ctgtgttttt agcacaagaa aagtttgtat ttgaagtagg gatgatttat aagttattgc 2160
taatatcagc agcaggaggc attggaggcg tattgggtgt caactttaaa tagagtttat 2220
attaaagtag ttttatgata taatcaatag gaaacaatta gaaatggagg cgttgaaaat 2280
ggaaaatata catgtaaaaa ctttatctca agctacttta aagcaaagtg ctgcaaaagg 2340
tggatgtgga gagtgtcaaa cttcttgtca atcagcttgt aaaacttctt gtacagttgc 2400
aaatcaagag tgtgaaagat agaaaattat tgatttagtt gtttgtaaga gggtatctca 2460
aagtgattct tttatcatta tgaggtatcc tcttttattt tcaattatta tataaaaata 2520
tattctattt atgtaaaatt aaatttgatt tcttatcgtt taattatgca aaaatatcta 2580
atcatataag agcattcatt ttaaagtagc tactcttata ttttattttt tatcaattac 2640
atgtattaga ataattttta attttaaact ctaatttatg atgattttta taatatgtta 2700
tatattttaa tatatggaga tagatattaa tttagtgata tatatttaag taataagagt 2760
aaagcaaaaa ttagtgcata gtagtatata aaaatggagt ttgaatactt ttgttagtat 2820
atgattgttt agtagaaaat cactataaaa taggatttct atctttgagc ataattatag 2880
agagctaagt agggggaagg ggttcaagaa gatattatta gtgtttgtgt tagtaaatta 2940
tgtaaacaca taaaatatat tgaaaaataa tattatatat aatattattt atgtgtggga 3000
aatacgaata taaaaaataa aaaataatat tcggaggaca taaatatgtt aagcaacaaa 3060
aagagaagta tggcaatagt aatggcaggt gctacagtta tgagtgcagc agcaccaata 3120
ttcgcagaca atactgtgac agaaaatgtt gataaaaatt acacagtaag tgcgaaagat 3180
tctgctaaac taatcgaaga ggttagaaaa gcattagaag ttaagtttga agatacaaaa 3240
gcaggagcaa atgtaaatga tagagtttac gatataaaag tagacaatgt aaacttaact 3300
aatgcaactc aattacaaaa taaaataaac tctttaacag aagggcaaag tttaaaagtt 3360
actattcaag ataaagggca tcaagtgtta ggtggaaaag tagttgacta taaaattgaa 3420
aactacaaaa ctgctcaaga aatagtagat gcagttaatg cttacaatgc aactttagca 3480
gaagattctg ataacaagtt aacagcaact ataaaatcta caaatacagt tgaagtaaaa 3540
agagcaaaag attcagcaaa tgttattaca ttaaatgtag gagaccaaca tttagatttc 3600
tcaaaagtga taacatcaga agaaggtact tttgaaggtt atgaaaagcg ttatagtgac 3660
atagattcaa aagaattaca tacggtaaca gttaaaaacg cagatttaca agacatatca 3720
gcagaagaat tatttgatgg aattagatta actacattag gaagagaaat agttaataaa 3780
gttaaaaatg gatatgcttt aacatttgaa aatgaagcaa tccttactca agaacaagaa 3840
gatagtgatg ataaagataa accagaaaaa tcaagctttg atatagtttt aagcaaggca 3900
aatgaaaagc cagaaacaat aagtgtttca agtaaaaacc ataagttagt tagagactta 3960
cacaaagtat taactgatgt aaaagatgga aaagaattaa aagtagaagt tttatctggt 4020
gattcaagat ttacaacagc agtagaagtt agtaaagaga gattcaaaga tggagaagca 4080
gaagctataa ttttagttgg agaagatgct atagttgatg gattagcatc agcaccactt 4140 
gcctctcaaa aaaatgcacc aatattatta tctaaaaaag attcactacc atcagaaata 4200
gaagctgaaa tattaagagt acttggaagt aatctatctt ctaagaaaat atatatagta 4260 
ggtggagaat ctaaagtatc aaaagaaact gaagaaaaac tttctaaatt aggtgtaagt 4320 
aaagttgaga gagtttctgg agaagataga tttgaaactt ctttagaaat agcaaaacaa 4380
ttaaaagata catttaagac tgcttttgta gtaggtggaa atggagaagc tgatgctatg 4440 
agtatatcag ctagagctgc tcaatttggt gctccaataa tagttacagg taacgaatta 4500
gatgcaaatg ctgaaaaatt attaaaagga aaagaattag aaatagtggg tggagaaaat 4560 
tctgtatcaa aagaagttga agacaaatta gttgatatag atttaaataa taaagttgaa 4620 
agattagctg gagaaaatag aaaagatact aatgctaaag taatcaataa atactatgca 4680 
ggtgcaacta aagcatatgt agcaaaagat ggttatgtag gtggaaatgg acaattagtt 4740 
gatgcactta cagcagcacc acttgcagct agttcaaaag ctccaatagt attaactaca 4800 
gaagaacttt ctaaatcaca agaagaagta gttgagttaa gacttaagaa tgctactaag 4860 
ttagtacaaa taggtgaagg aatagctaaa aacgctattg aaaaaatagc agaaaaaata 4920 
aacttattta ctaagaacta ataagatata ggaattaata ttaaaagttt atttagataa 4980 
tatatgatta tacaatgcgt taatatgtaa gacatagtgt agatattata aaatatacaa 5040 
ataacaagat aaaataataa aaataaggtg tctcatttag aattagattg agatacctta 5100 
ttttattttt tgattataat gttgagaaat tcatattgca aaaattatta taaaaagata 5160 
cttcttataa tgattaaaca tattactaac ataaattatt ttgaacttat aacttaatta 5220
aataaaataa attataatgt tttgaattta tagagaataa aaactttagg tatattgttt 5280
gagttgtatg gataactatg gtatcatata taaaatatgg taacaaaatt gtaacttttg 5340
attaaaaata agtatgagta ctaataaaaa atatgaaaag tacattgggg gtagaaaata 5400 
atatgaagaa tagtaaaaag atattagcta taggacttac actattttta gtaatggtaa 5460 
atactcctat ggtaagtgcg ttgacatcag ttgagcaaat aaaagggaat gacagatatg 5520
agacagcagc aaaaatagca gataagcaaa actacaatac agcaatacta atcaactcag 5580
ataatagctt agcagatggt ctaagtgcaa gtggtttagc tggagcctta aatgcgccta 5640
ttttgatgac aaaacaaaat cagattccaa acacaactat ggaaagatta aacaaagcaa 5700
agactgtgta tataataggc tcagaatcaa caataagtaa aaatgttgag aatcagttac 5760
tatccaaaaa gaaggttgta caaagaatct ttggtgaaaa tagatttgat acaagtataa 5820
aaatagctga aaaaattaaa gaaattaagc caatagataa agtaattata gccaatggat 5880
ttacaggaga agcagatgcg ataagtgcat caccagtagc tgctagagat ggagtaccta 5940
taatacttac agatggaaat agtgtgggat ttgacacgac aggcttaaag agttatgcgc 6000
ttggttctag tgaaataata agcgacgaat tagttaagag cacaaattct attagattgg 6060
gtggcacaga tagatttgaa actaataaga tagttataca agagttttat aaaaatagta 6120
aagaatttta tctaagtaaa ggcttacaat taacagacgc tttggctgcc tctacaatag 6180 
caaaaaatgc tcctgtagtt ttagttgaaa atggcagtaa taagagtatt ttaagtggtg 6240 
cagacaaatt gactgtacta ggaggcataa atcagaatgt tatcaagcaa tgtataaatc 6300
aagcttcacc aaatcaacaa ggtttatatt acaatccaaa tgatagagca tttaaagaga 6360
gaataaaggg taaagtatat gctcttacaa agcagtatag aaaagagaat ggagtaagag 6420 
cgttatctgt agcaagcaga ttagagggtc tagcaaatga ttggtctaat ttaatggcaa 6480 
ataagaagac tttatctcat acgataaatg gaaaaaattc atattctact tttttgaaat 6540
acttagattg gagtgagatt aaacctggat atatagcagt tcaaggtgaa aatataatta 6600 
aatataaaat tccagataaa cctgtatata caaatagaga tgcagatgat ataggaaact 6660 
ttatttttaa tgagtggaag acaaatccag aagaaggaac taatatgttg cataaagggt 6720
atgaaataat gggatttggt atagctataa caggagataa gaatctatat gctacacatg 6780
aattttatgg aagatataaa gaataagaaa atatacacaa aattatgctt gtaaggaagc 6840 
tatataaata attaaattat aatggaaaat agcatttata ttaataaatt aatataaatg 6900 
ttattttttt ataatttaat gtaaaaaaat tgtattaaaa agtttaaata tagtataaaa 6960 
tgggtaagaa agtagtgaaa attaaatggc aaaattaggg aaatctaatg acgcaaaact 7020 
atagggatta aagcttttaa agccatatca gccagttgct aaaaagagta aagtcttttt 7080
attttataat taagattctc tttgagtgtc tttttttatt gtctgaaact tatagtattt 7140 
aaattcatga aaaattagat atttattaag ggaggattgt tatttcatga aaaaggcaat 7200 
atcttgtgta ctagcagtat ctatgtgtag tgcaccacta aacgtttttg cagaacctat 7260 
attagagggg aagctaaggg cagttgaaaa atcagttaca gaaaagttaa gggggagtct 7320 
tgaagtagat ttaaattttt cactgccaat aaaaaatagc gagtcagaaa gtatgactaa 7380
tatgcagtta cgattaaaag atgatagtaa caatagtgga ataataaagc ttggagaaaa 7440 
agattcagga acagtaaatg ttgggacaga taatatagaa tatactataa aaaaattatc 7500 
tgcaagtaga agtgaaataa aaaataaaga tgaaaatgta tcatattata atattgtatt 7560 
taataattta cctgttggaa aatacaatgt agaagtctct ggagctggat ttaagaacaa 7620
agaaattaaa gacattgata tatcaagtta ttcacaaaga gtcttattaa gcaatagagc 7680 
aagtgtagat aataaaaata aacctttaaa ttatgaacta ttcttaatgg gtgatgtaaa 7740
tggagacggt aaagtagata agagtgatta taataaggtt ttagagaata tagattcaaa 7800
taaaagagaa tttgatttaa atagagatgg aaaggtagat attgttgact tagattatgt 7860 
acagaaaaat ttaggacaaa gtgaagatgc aaaaagccta gaaagtatag tatctaccaa 7920
tcctatagta gatacaagta aggttgaatt aaaaggaaat agtgatgtag atattaatgg 7980
taatgtagaa aatttattca gtggagaaaa taatggagtt acaatatctt caaaaaatga 8040
tgaaatatct gaggaaaaac cagcagtaat ggatatagct ttatcagaag cacttaatat 8100 
ggaagtcatc aatataaaag ctccagaagg tactgctcca gcaaaaggaa tagttgaggt 8160 
tactgacaaa gatggtaata ttcaacagat tccttatgaa aacaaagaat atatgaaaaa 8220 
cttaaaagat gaggagcaac aagaggaagt gaataacgaa caagaagaag taaatcaaga 8280
agaagtaaat aataaagatg aaaatataaa ccagactgat aataactcaa ataataaaga 8340 
tgaaaataat gataaacaac aggaaaatag cgataaacaa gaagcaaact ctaataccaa 8400 
tcaagataaa aataatgata aagaaaattc atcaatggat aaacctaatt cagatgaaga 8460 
tactccaaaa gaaggggata atatcagtaa agatgataac tttacaacat ctacatatag 8520 
taggtcagca ttagaaacat atgcagatgt tcaaaatagt gatatagtaa taaatcttgg 8580
taaacaagta gcagtaaaaa agataactat aaaaataaca gctacaacag caaatgatag 8640 
aaatcttgca gaaatatcaa aagttgaatt tttaaacaat gtatataaag aaataccaaa 8700 
gccagagatg aacataccta aaataaataa aatattgact agcactgctg ttggaaatga 8760 
aaaagtatct ttttattgga ataatgaagt aaatgtatca gcatacgaac ttatactaga 8820
caaaattgat agtaagggaa aagtattatc tactaaaaaa ttacaaacca gtaaaaatac 8880
aatagaagtt agagatatag aagcatatgg actatataga gttagcatac aatctttaaa 8940 
tggagattgg tctagtggtt ataaagatgc aacaccagaa aacttagacg gagtaccaga 9000 
aaatgtagat tcaaactata aacctataca atttagccca gatagtatat tagaatttca 9060 
agttgttcca gacacaaaac cagaaccacc agaaggtata aatgttaaag gtaaatttaa 9120 
atctcttgat gtaagctgga aaaatcataa acaagctaaa gattttgact tgtactataa 9180 
agaaaaaggc tctaatgggt cttggataaa atacaatgat aagcttattc gtgggacaag 9240 
ttcaaagata gaaggtctta aggataatac aacatatgaa gttagaatga cagctactaa 9300
ccacttggga actagtggaa tgtctaaaac atatataggt acaactgaaa gcatggatgc 9360 
accaacaaca cctaactata gattaattaa tactcctaaa gagaatggtg gatttggaat 9420
accaacaaat cacattgtag atgttactta tccagcagga tatgacaaaa gtgaatatag 9480 
tgataaaaat ccatttgata aattcaatgt agtagatggg gactttacaa cacattggac 9540
attcccaaca tataatgcag ctgagggaac gaatagagga ccaataataa cttttgataa 9600 
atcttatact atggatactt ttatggctac acctcgtttt gatgtttctg ataagacaat 9660 
aaatgacttt gaggtacgat attgggatga aaatggtcaa ttacacaatc tgggggatgt 9720
tccttttgaa acaaagacag accctgtaaa taataataag tatgtaatgg ttagattaaa 9780 
agaaccaata actgctacca agatacaagt aagtgtttgt tcatcctatt catataagtc 9840 
taatatatca gaacttaagt tttatgagta tgacagtata gaagatgatg taagaaatct 9900 
atttatggat gacctacaag ttgaattaaa agaagatgta actcaagaaa aaataactga 9960
attaaaagat agattaaata ctccagataa agctagtgga gaatatcatc catataagac 10020
agttattgag agagaattaa aattagctca agacttatat aatgatagag ccacacttac 10080 
tgatgtaact actgtaaatc aagaaataaa taaattagga aaatataaag ataaaaacgg 10140 
acagtatact gtaaattcca ataatttagg tatgcaaaat gactggcagg cattaggagt 10200 
agctgctaga gcaggagatg aaatcacagt atatgttgga agtaaaagtg gaaaaactcc 10260 
taagttgata tatacacagt actatggaga atctggtgca tataagagtg gagaaataaa 10320
tcttaacact ggaaaaaatg taattcaatt aagcaaactt catagtttag atatagaatg 10380
tggaggaagt ttatacataa gatacgctga agatactcca agtggaggag atataaaagt 10440 
acgtgtagca ggagctacta agataccaca ccttaatcta aatggaacaa taaatgataa 10500 
gagtcctgcg ggagtttctg aatctaagaa aaagataaaa gcatatatag aagaattagc 10560 
tagatataaa gatgctgtaa aaggagaacc atattatcca ggaacaaatg ttggtaattc 10620
atatggatat tcagaaaaaa ctggtgtgtt aaacactaca gatattgaaa gtgataaagt 10680
aacattaaat gttcctgcaa cagctgtata tcaagggata tcaaaaggaa atgctgattt 10740 
aaatacacaa gtagatagat tatataatag tatgcttgca tgggaacaaa taatggattt 10800 
agtttactca gaaagaggag tatttaaaac acaagattta gatggtaatg gaactgtaga 10860 
tgataatgaa tttaacttaa ctaaaaatga tatggctcca aaatctcgta tgaacataaa 10920
ataccaaaga atgtttattg gagcatttat gtatgcttct ggattacatg taggagtaga 10980 
atatggttca gttccagggc ttatgaatgg agtacctttc caaatagatg atagtggaaa 11040
agctacagga ggaaacttat ttggatgggg aataggtcat gagataggac atgttacaga 11100 
tattggaaag atgacatata gcgaaactag taacaatgta cttgctttac ttgctcagac 11160 
ctttgatgat aagacacatt caagacttga aggttcaaca atggataaga tatatgagca 11220 
tgtaacttca aatagcttag gtataccaag caatgtattt gaaaggcttg gaatgttatg 11280
gcaattacat ctagcttatg atgatgattt tacgggttca atgttaaaga ataactcaga 11340
tgctgattta agtaatgata cattctatgc aaagatttca agaaaataca gagcactttc 11400
aagtagtgac ccaattaata gtttgccaaa agaccaaatg ctagtagcaa tggcatcaag 11460 
tgttgttgaa aaagatttaa gagattactt taaggcatgg ggagtagaaa ttacaccaga 11520 
attaaacagt ataatggata gtaagaacta cgaaaaagaa tcaagaaaga tacaatatgt 11580
aaatgacgaa gctagaagaa agattttaaa tgaaaatata gctagtatgg caaaagatac 11640 
taaagtaagt gctagttttt ctgatggaat acaaagtggt tctctagtta aatctaataa 11700
aatatctata gacctagatg tagataagga taaagataag atactaggat acgaaataat 11760 
aagaagtgat ggaaattatt tagatggtga aaatggaaca caggtaaaat acagaacagt 11820 
tggatttgtg gatgcaaaag gaaatagtaa gactaaattt actgataata tatctcctct 11880
taataacaga gcatttacat ataaggtagt tgcatatgat tatcacctaa atccaacaga 11940 
agaatttgaa gttggtactg taaaattatc agatgaaggg aaaataaata aatcagcatg 12000 
gtcattcata acaaatactg tttctgacgg tgatgttaga actgaaaatg actcacatgg 12060 
accattacaa aatcctgaaa ttgataatat aaaagataat gatgcaagta ctacatataa 12120
aggaaagata ataagtaaag cagagtggaa caaagaccct caaaaagacc ctgatataaa 12180 
tgtagaagaa aatccatata taatagtaga catgaaagaa acattaccta tagtaggact 12240 
aaaatataca aaaccagagg cagaagctag gaagttttca ttaaaaggat tatttaactt 12300
tagaaagaat actagcacaa catacaatcc tttaactaat tacaagatac aagtaagtaa 12360
tgataaaaag aactgggaag atgtaagtca aggaactttt gaatatggac aagatttagt 12420 
tggtggagga aaagaccaag ataatgaagc tagagttaca tttaataaag atggtaaact 12480 
ttggacatac caagctagat atgtaaaact aatctcaaat aataagtcta atatagaagt 12540 
tgcagaatta gatataatag gacctccagg tgataatata gaaattggca caatgaataa 12600 
tgcagaccaa actcgtacta atggaattgg aaaattagca gaagattatg tatatcaaag 12660 
tgatgactct agcacaagtg aagatgaaac aggtaagatt ccagcaggtt caatagttat 12720 
tacaggtgaa tacagtggaa atccagcatt taatttacca ctacttatag ataaaaacaa 12780
taagactata tctggagaag cacttttatt tgcagaagtt ccagaaaaag gtgaactagg 12840
tgaaatcagt aaaggtacat gggtttacta tatgccaaaa gaaaactttg attctttaag 12900 
tgataaggtt aaggctgaac tatatagata caatgacctt caaggagata ctccagtagg 12960
tcaaagattt gtaagtgata caatgtatgt acctgtagga gctaagaatt atgatgagtt 13020
aaagacaata agtcttactg atagtaatag taaagctaga aaagctacaa gaaatgctat 13080
tgatatcagt aataagaagg ttataactgt aagcaataaa gtaaagggta atattgttag 13140 
agataaatag taggtggaac tctaaaaacg taaaaatact aacactaaca ttgtatatct 13200 
aataagatta ggtactagaa tagttttaaa atagtagata ttaaaagtaa agtatataaa 13260 
tgtaactata agtttaaggt aatttataaa ttaaatataa aagctaggtg gataagtaaa 13320
tttatccact tagttaatga tagataagga cgggtgttaa tgaataaaag aaaatctttt 13380
ataagaacta tagcagtatc aactatggca gtagctgtaa ctggtagtgc tacatgtgca 13440 
tatgcagctc cagttttaca aggaacaaaa acgtatgaga aagtaaatac tatagatata 13500
tctgtggact cagtagaaaa catagtatat tcatttcaag ccagtataaa agtacaaggt 13560 
gaggttgaag tagtagataa tgagcagaaa gaaaagatta cttggtctga taatataaaa 13620
tctcagataa aaagtggaaa tgcaaatgct acttgtagag cagaatacaa caaatcaagt 13680
aatacaacta cattagatat atacgttaca tcaaatgaag atttactaga cggaaataga 13740
ttgaatatag gaagaatatc tgtaaaaaaa tctggcagta actcaaatgc tgactataaa 13800
gttttaggaa aaggaacaag tgataaacca gctttaaaaa ttgtaacata taacaacaaa 13860
acagtggatt atgaaaatat aagctctgat gaaggattaa tatttaccct tattaatgag 13920
agtgaagtaa aaccaattgg tggaactggt tctagtaaaa atgaccctga aaagtataag 13980
gtagagaaaa gtgaagcact tgaatattta ttaaacaata ttaggataaa ctattctatt 14040
gttagtaaag aaacacaaga aagtggttca aatgtaatat taaaacttgg attagcacaa 14100 
aaaacaacta agggaagaaa agctacaata aacaaatatg ttgaagttac acttccaaaa 14160 
agtttagaat acatagttga aaatgaatta tcaaaaccgg atgaattacc accagataat 14220 
ggtagtggtg gaaataatgg tggaggaagt aattcaggcg gctcatctag tggtggaagc 14280
tctggaggag gaaattcaag tgattctaca tctaatgtca cagtcaaaaa attaaaaggt 14340 
gctgatagat ttgagacagc tatcaaaata tctcaaagtg gttggacaaa atcagataca 14400
gttgtaatag taaatggaga agataaaagt atggtggatg ggcttacagc tacaccactt 14460 
gctagtgtaa aaaactcacc gattctttta tcatcaaatg aaaagcttcc tcaaaaaact 14520 
gtagaagaat tgaaaaggct aaatccatca aaagtagttg tgataggtgg aaacaactca 14580
atgccaaatt cagttgttga agctatcaaa gctgtaaatt caaaaatatc agtacaaaga 14640
ataggtggag atacaagata tcaaacttct attaatatag caaaagaaat agataggacg 14700 
aataatgtga gcaagttgta tataggagca ggaaatggag aagctgattc attatcaata 14760 
gcatcactgg ctggtaaaga aaaaactcca atagtactta cacaaaaaga tggtgtagat 14820 
aatgaagctg aacaatttat taaatcaaat aaggtgtcaa atatatattt tattggtgga 14880
gtagaaaaaa tatctaacaa ggctattgaa caagttggaa aaatagcaaa caaagatata 14940 
tcaaacaata gggtggcagg tcaaactaga caagagacga atgccaaagt aatagataaa 15000
ttctactctc aatctaagct agatggagtt gtagttgcaa atcaagacaa acttatagac 15060
gctttagcag ttggaccatt agcagctaaa aacaattcac cagttatact tgctacaaat 15120
acgttagata aatcacaaga atcgtcatta aaaggtaaaa attcatcaaa gttgtttgaa 15180 
gtaggtggag gaatagcctc atctgtaata gacaagatta agagtttaat tgaaaaataa 15240
atattaagta aatagaaaat aagaatgtat ttagatataa agttatatat ttaaatgcat 15300 
tcttatttta ttttcattat cttaacaatc taataatatt gaggttaaaa tgtataataa 15360 
tttgagctat attaagttta attttaaata aagcagatat cttttatata tataaatata 15420
aattcaattt tgaatatgtt taaaaagaag ttaaaaataa ttttttaaat ttaataaaat 15480
aaaacactaa tatatagacg agatatattc atatcatata tactctataa tatatggaaa 15540
ataaagagaa aataaatgga aaataatcac aatataaata taaaatataa aaatcatcag 15600 
ggggatatga aaatgaacaa aaaaatatta tcattaggtc tagcagtatc attaatctta 15660 
gtaaacttca aaagtgtaaa tgcatcatca gtagtagaaa aaatatacgg taaagataga 15720 
tatgaaacag cagcaaagat agctgataaa cagacttatg aaacagtaat tttagtaaat 15780 
acagaaaaat cacttgcaga tggattaagt gtaagtggat tatcaggagc tacaaaagct 15840 
ccaatactat ttacacaaca aaataagata ccagcagaca caaatagatg tctaaaaaat 15900 
atcaaaaaag catatataat aggtacagaa gatactataa gtaaatcagt agaaaaagaa 15960 
ctagattcta aaaatataga agtaaaaaga attggtggag aagatagact aaaaacaagc 16020
tatttaatag ctaaagaaat agcaactata aaaaaagtgg ataaagtact attaactaat 16080 
gcatatagtg gagaagcaga tgcaatgagt gtatcatcag tagctactag agatggagct 16140
ccaatcatac ttacagatgg aaagagtgtg ccttttgatg taaaaaatat tcaatcatat 16200 
tgtataggtt cagaggagat aatgagtaat cctctagtta aaaatacaaa ctcagtaaga 16260 
atagaaggaa ctgaccgatt tgaaactaac aaaaacgtaa tagactactt ttttaatagt 16320 
gcagatggat tctatgtatc agatggatac caattagtag atgcaatagc agctgcacca 16380
cttactaaaa actctccaat ggtactagta aatgatggaa gcgataaaac tgtattagaa 16440
ggagctaaga atataacctc tgtaggtgaa ataaatgaga aggtaataca acagtgtata 16500
aatgcttcaa agtcaaatgg acaaccccca acaattacag ttggaagtac agaggtatat 16560 
aagggtgaaa agtttgacac tggcaagtta aatatagtag ctaaagataa cacagggaag 16620
gtattaccaa tagaagttga tgggtttata gatacaaata gagtaggtac atatatattg 16680
acattaaaag ctacagacga atggggaaaa agtacaggaa aaagagtaga aataaaagtg 16740
ttggatgaca aatctcatga ttacaacagc ccagaattta aaaaaatggt atctactgaa 16800 
atgtataact taatcaattc ctatagaaaa gaaaaaggca aagagccttt ggtagtgtct 16860 
agtaggctag aaggtatggc aaatgcatgg tctaagtata tgatggataa aaaagtattt 16920
gcacattata tagatggaaa aaatgctcca caagtatttt ctgagtttgg aatgagaagt 16980 
gaggaaaata tagcatatat ttatatagat tctaagaacg ttcaaactac acaagatgca 17040
aaagacttgg caaaggctat atttgaggtg tggaagaaat ctccagagta caatgcaaac 17100
atgcttagtg atgagtttta tagtactgga tttgggcttt atatattatc tgatggtcaa 17160
gtgcatgcta ctcaagagtt tttaaatggc aatgaaggta gcttgtaaaa aattccataa 17220
atcttgtttt tttgttaatt tactgatatt ataaaatgaa aatcacatct aaagtacttt 17280
ttgtaatctg ttttgttatt gctttacagg ttatggtatt ttaggtgttt tttgttttag 17340
tagtttataa attttgaggg aattactact gaggatgtaa tttatgatta gctgaattat 17400 
ataaaaaaat tagtaggagg agtttgatga aggtaaataa gagagtatta tcaataggat 17460 
tggcaatatc attaataatg gcaggagcac caaatatcaa tgcactttca agtatagaaa 17520 
agatacaagg gaaagataga tatgagacag ctgccaaaat tgcacagaaa caaacatatg 17580 
aaaatgtagt attagtaaat acagataata cactagctga tggattaagt gcaagtggat 17640
tagcaggaac agttaaagca cctatattat tatcacaaag aaacagtata ccatcagaca 17700 
cagaaaaaat gttaaaagat gtaaagaagg tttatataat aggaactgaa gattcaatag 17760 
gaaaatcagt tgaaaatgaa ttaaagcaaa aagggataga agtcaaaaga ataggtggta 17820
atgatagaat tgaaacaagt tatcttatag caaaagaaat agcttcaata aaacctatag 17880 
ataaggtttt tataactaat ggttatacag gtgaagcaga tgctatgagt gcgtcatcag 17940 
tagcatcaag agatggagca cctataattt taactaatgg taaaaatgtt ccatttgaga 18000 
aaaaagaagg tgtgcaatgt tatgctcttg gttcagagga gataataagt aatgatttag 18060 
ttaaaaaaac taattcagtt agattagctg gagaagatag atttgagact aataaaaaag 18120 
ttataaaaca tttttatagt tcagcaaaag aattttatct ttctaaaggt tatcaattag 18180 
tagatgcagt tgctggttct tctatagcta aaaatgcacc tattgtattg gttgatggaa 18240 
atagtgataa gtctgtactt agaagtgctg ataagattac agcactagga ggaatagatg 18300 
aaaagacgct agaacaatgt ttaagtgcat catcactgga tgcaagtgca ccaacaataa 18360 
ctgttggaaa tttaaatatt tatcaaggtg ataagtttga tataagtaaa ctgaatatcg 18420
tagcaaagga tagcaatggt aatgatttga caccagagct tataggaaat attaatactg 18480 
ataaggttgg taaatacaaa gttactataa aagctactga tattggtgga aagactacct 18540 
caataagtgt agaagtaaat gtacttgaat ataagactaa tgatatgaat agtagtgagt 18600 
ttaaaagaat ggtttctagt gaaatgtata gtttggtaaa ttcatataga aaagaaaaag 18660 
ggaaagagcc attacaagtg tctgaaaatt tacaaggaat ggcaaattat tggtctaagt 18720

atatggctga taaaggtgaa tttgcacatg taataaatgg taaaaatgct gctgaagtat 18780 

tttctggtgg tattagaagt gaagaaaata tagcctttgt accacttact acaaagagta 18840 

cttatacaac aaaagatgct agagaagtag caaatgtcat atttacagtt tggaagaagt 18900 

ctgataagta taatgaaaat atgttgagtt ctaagtttgc atatactgga tttgggttat 18960 

atatattacc agatggacaa gtgtatgcta ctcaggagtt tttaaataag taaaattaaa 19020

ttttttagtt tattacattt taaaatttag ggtataaaaa cttgtaaact tggagaaaat 19080 

aataatttaa aaaaatagct tgcaaaaaga ataaaaatgg attattatag agatgtgaga 19140 

aatattagga atatatggat gattattcta tgtacataat aaagagatgt aattttaata 19200 

taatgttggg aggaatttaa gaaatgaata agaaaaatat agcaatagct atgtcaggtt 19260 

taacagtttt agcttcggct gctcctgttt ttgctgcaac tactggaaca caaggttata 19320

ctgtagttaa aaacgactgg aaaaaagcag taaaacaatt acaagatgga ctaaaagata 19380

atagtatagg aaagataact gtatctttta atgatggggt tgtgggtgaa gtagctccta 19440

aaagtgctaa taagaaagcg gacagagatg ctgcagctga gaagttatat aatcttgtta 19500 

acactcaatt agataaatta ggtgatggag attatgttga tttttctgta gattataatt 19560 

tagaaaacaa aataataact aatcaagcag atgcagaagc aattgttaca aagttaaatt 19620 

cacttaatga gaaaactctt attgatatag caactaaaga tacttttgga atggttagta 19680 

aaacacaaga tagtgaaggt aaaaatgttg ctgcaacaaa ggcacttaaa gttaaagatg 19740 

ttgctacatt tggtttgaag tctggtggaa gcgaagatac tggatatgtt gttgaaatga 19800 

aagcaggagc tgtagaggat aagtatggta aagttggaga tagtacggca ggtattgcaa 19860 

taaatcttcc tagtactgga cttgaatatg caggtaaagg aacaacaatt gattttaata 19920

aaactttaaa agttgatgta acaggtggtt caacacctag tgctgtagct gtaagtggtt 19980 

ttgtaactaa agatgatact gatttagcaa aatcaggtac tataaatgta agagttataa 20040 

atgcaaaaga agaatcaatt gatatagatg caagctcata tacatcagct gaaaatttag 20100

ctaaaagata tgtatttgat ccagatgaaa tttctgaagc atataaggca atagtagcat 20160 

tacaaaatga tggtatagag tctaatttag ttcagttagt taatggaaaa tatcaagtga 20220 
ttttttatcc agaaggtaaa agattagaaa ctaaatcagc aaatgataca atagctagtc 20280

aagatacacc agctaaagta gttataaaag ctaataaatt aaaagattta aaagattatg 20340

tagatgattt aaaaacatat aataatactt attcaaatgt tgtaacagta gcaggagaag 20400

atagaataga aactgctata gaattaagta gtaaatatta taattctgat gataaaaatg 20460

caataactga taaagcagtt aatgatatag tattagttgg atctacatct atagttgatg 20520 

gtcttgttgc atcaccatta gcttcagaaa aaacagctcc attattatta acttcaaaag 20580

ataaattaga ttcatcagta aaatctgaaa taaagagagt tatgaactta aagagtgaca 20640

ctggtataaa tacttctaaa aaagtttatt tagctggtgg agttaattct atatctaaag 20700 

atgtagaaaa tgaattgaaa aacatgggtc ttaaagttac tagattatca ggagaagaca 20760

gatacgaaac ttctttagca atagctgatg aaataggtct tgataatgat aaagcatttg 20820 

tagttggtgg tactggatta gcagatgcta tgagtatagc tccagttgct tctcaactta 20880

aagatggaga tgctactcca atagtagttg tagatggaaa agcaaaagaa ataagtgatg 20940

atgctaagag tttcttagga acttctgatg ttgatataat aggtggaaaa aatagcgtat 21000

ctaaagagat tgaagagtca atagatagtg caactggaaa aactccagat agaataagtg 21060 

gagatgatag acaagcaact aatgctgaag ttttaaaaga agatgattat ttcacagatg 21120 

gtgaagttgt gaattacttt gttgcaaaag atggttctac taaagaagat caattagtag 21180 

atgccttagc agcagcacca atagcaggta gatttaagga gtctccagct ccaatcatac 21240 

tagctactga tactttatct tctgaccaaa atgtagctgt aagtaaagca gttcctaaag 21300

atggtggaac taacttagtt caagtaggta aaggtatagc ttcttcagtt ataaacaaaa 21360

tgaaagattt attagatatg taatataagt tttaataaaa ctttaaatag aaaaaggctt 21420 

ctctcatgag aagtcttttt tatttaaaat aaatataaaa taaaatagag gctataaata 21480 

gcctctattt tatgtgagaa atccctaaat aaaaagatgt ttaatttatt taagtaaaat 21540 

cattatacta taaatcaaaa taattgtcta ctattatgga tttttaatgt ataatagtaa 21600 

ttggacaata gaaaaggagg tacttatatg tcagttatag attcgatact tgacaaagca 21660 

gatgagcaag aaataaagaa attgaatgtt atagtagata aaatagatgc attagaagat 21720 

agtatgaaaa atctatcata tgaagaacta aaagacatga cagctatatt taaaaataga 21780

cttaaaaaag gtgaaacctt agatgatata cttccagaag catttgctgt agtaagagaa 21840 

gtatcaaaaa gaaaattggg aatgcgtcaa tatagagttc aattaattgg tggaatagta 21900 

atacatcaag gaaagatagc agagatgaaa actggtgaag gtaaaactct agttgaagta 21960 

gcaccagtat atttaaatgc ccttacaggt aagggtgtac atgtaatcac agtaaatgat 22020 

tacctggcag aacgtgataa ggagcttatg agcccagttt atgaatctct tggaatgaca 22080

gtaggagtaa ttatatctaa ccaagaccca aatataagaa aacaacaata taaatgtgat 22140 

ataacttatg gtacaaatag tgaatttgga tttgattatt taagagataa tatggtgcca 22200 

gatttatctc ataaagtaca aagggaacta aattttgcta tagtagatga agttgactca 22260 

atattaatag atgaagccag aactccactt attattgcag gagatggtga tgaagattta 22320 






aaactttatg agttagcaaa tagctttgta aaaactgtta aagaagaaga ctttgagctg 22380
gacagaaaag ataaaactat agcattaaca gcaagtggta taagtaaagc tgagtcattt 22440
tttggtataa caaaccttac tgatataaag aacatagaat tatatcatca tataaatcaa 22500
gctttaagag gtcataagct tatggaaaaa gatgttgact atgttatttc aaatggagaa 22560
gtaatgatag ttgacgaatt tacaggaaga gtaatggatg gtagaagata tacagatgga 22620
cttcaccaag ctatagaagc aaaagaaggt gttgagataa agaatgaatc taaaactatg 22680
gctactgtga cttatcaaaa tttcttcaga ctatatgaaa aactttctgg tatgactggt 22740
actgcaaaga cagaagaagg ggaatttgag tcaatctata aattaaatgt tgtccaaata 22800
ccaactaata gaccagttat tagagctgat ttacatgata aggtatttaa aacagaagaa 22860
gaaaagtata gtgctgttgt agaagaaata ataaggatac ataagactag acagccaata 22920
cttgtgggaa cagtttctgt tgaaaaatct gagaagctgt ctaaaatgct taaaaaacaa 22980
ggtattaagc atcaagtctt aaatgctaaa caacatgata aagaggcaga gataatttct 23040
aaagccggta aattagatgc tataacgatt gctacaaata tggcaggtag aggaacagat 23100
atttctctag gtgcaggaga taaagaagaa gaacaagaag ttaaagattt aggtggactt 23160
tatgttatag gaacagaaag acacgaatca agaagaattg ataatcagct tagagggcgt 23220
tctggtcgtc aaggagaccc aggtacatca agattctttg taagtcttga agatgatgta 23280
ataaagcttt atggtggaaa aactatagag aaacttatga agagaacaag ttcaaacgaa 23340
aatacggcta ttgaaagtaa agcacttaca agagctatag aaagagcgca aaaaggtgta 23400
gaaggtaaaa attttgaaat aagaaaaaat gttcttaaat atgatgatac tattaatgag 23460
caaagaaaag ttatatataa tgaaagaaat aaagtgttaa atgatgaaga tatacaagaa 23520
gatattcaaa aaatggttaa agacatcata caagaagcag gagaaactta tttaattgga 23580
agaaaaagag attattatgg atattttaaa catttatata gtacatttat gccagcagat 23640
acattactaa tacctggtgt agataaaaaa agtgtccaag agataattga tagcacatat 23700
gaaatttcaa aaagagttta tgacttgaaa aagatgatgc ttggtattga taaggttgca 23760
gagttagaaa aaacagtact tttaaaagtg gttgatcaat actggataga tcatatagat 23820
gctatggaac agttaaaaca gtatataggt cttaaatctt atgctcaaaa agacccattt 23880
aaggaatatg ctttagaagg atatgacatg tttgaagctt taaataaaaa tataagggaa 23940
gcaacagtgc aatacttata taaatttaac taaaaaagtt catgtaattt ttattaaatg 24000
aaaaataaag tattttgatt ctttatattg taattacaat taaataatta aggaatacaa 24060
atgaaaaaga agctgaaaat cagcttcttt ttctgtgtga gaaatacgga aaaaagagat 24120
gcttgaaatt tctttcttcc atttaatatt atactataaa gaagatgaaa agtctattag 24180
ttataaactg gattattaaa aaaaattaca attttgttac ttttttgtat acataaatat 24240
taaaaaaata aaatataaga aaaagtttat atcttttggt taattattac aataagtctc 24300
atttattgaa ataatatcaa atatatatta taattggtaa ataaggaaaa aataataaaa 24360
atttgaattt tttaggggga aaataccatg aataaaaaaa atctttctgt aattatggct 24420
gctgcaatga taagtacatc agtagctcca gtttttgctg cagaaactac acaggtaaaa 24480
aaagaaacaa taactaagaa agaagctaca gaactagttt cgaaagttag agatttaatg 24540
tctcaaaagt atacaggagg ttctcaagtt gggcaaccaa tatatgaaat aaaagttggt 24600
gagactttat caaaattaaa aataataact aatatagatg aattagagaa attagtaaat 24660
gctttgggag aaaataaaga acttattgta actataacag ataaagggca tataacaaat 24720
agtgcaaatg aagtagttgc agaagcaact gaaaaatatg aaaattcagc agacctttcc 24780
gctgaagcta actctataac agaaaaagct aaaactgaaa ctaatggaat ttataaagtt 24840
gcagatgtaa aagcttcata tgatagtgct aaagataagt tagttataac tttaagagat 24900
aaaacagaaa cagtaacttc taaaactata tatgtaggta ttggtgatga aaaagttgat 24960
ttaacagcaa atccagttga ttcaactgga acaaatttag acccttctgc agaaggattt 25020
agagtaaata aaatcgataa actaggtgta gcaggagcta aaaatattga tgatgtccaa 25080
ttggctgaaa taactataaa aaatagtgac ctaaatacag tttcaccaca agatttatat 25140
gatggatata gattaactgt taaaggtaat atggtagcaa atggtacatc aaagtcaatt 25200
agtgatattt cagcaaaaga ttcagaaaca ggaaaatata aatttactat taagtatact 25260
gatgcatctg gaaaagcaac agagcttact gtagagagta ctaatgaaaa agatttaaaa 25320
gatgccaaag ctgcattaga aggtaattca aaggttaaat tgatagctgg agatgataga 25380
tatgcaactg cagtggctat agcaaaacaa acaaaatata ctgacaacat agttatagtt 25440
aattcaaata aactagttga tggattagca gctacaccac ttgctcaatc taaaaaagca 25500
cctatattat tagcatctga taatgaaata ccaaaagtaa ctttagatta tataaaagat 25560
ataattaaga aaagcccatc agctaaaata tatatagtag gtggagaatc agcagtatca 25620
aatacagcta aaaagcaatt agaatcagta actaagaata ttgaaagact agctggagat 25680
gatagacata cgacttctgt agcagtagca aaagctatgg gttcttttaa agatgcattt 25740
gtagtaggtg cgaaagggga ggctgatgct atgagtatag ctgctaaagc tgctgaactt 25800
aaggctccta taatagtaaa tggttggaat gatctttcag cagacgctat caaattgatg 25860
gatggaaaag agattggtat agttggtggt tctaacaatg tatctagcca aattgaaaat 25920
caacttgccg atattgataa agatagaaaa gttcaaagag ttgaaggaga aacaagacac 25980
gatactaatg ctaaagttat agaaacatat tatggcaaat tagataaact atatatagca 26040
aaagatggat atggaaataa tggtatgcta gtagatgcat tagcagcagg acctctagca 26100
gcaggtaaag gtccaatact tctagctaaa actgatataa cagactcaca aaagaatgca 26160 
cttagtaaaa aattaaatct tggtgcagaa gtaactcaaa taggtaatgg agttga^ttg 26220 
acagtaatac aaaagatagc taaaatacta ggttggtaat aacttatact gtggagaaga 26280 
ttccttaatt aatacaggaa tctttttcta taattatgct atttattgta aattataaca 26340 
ttttataaag taggaggatg tagtagtggt agatgcaact ttatttttta caccacatca 26400 
agatgatgaa accttaagta tgggaagtgc tataattgaa catgttgaaa aaagtgatac 26460 
acatgtcata ttatgcacag atggtagtaa atctataata agaaaagttt tggatgatgg 26520
agggtcatgt tcttatcata ttaaagatgt tcataaatac tctttatcag aaagtgaatt 26580
ttcaaaagat agagatgaag aatttaagga tagctgtgaa gcaatgggtg taaaagaaag 26640
taatatacat atagaagata atagagcaca tgatggtgag ttaagtaaag aaaaggcaag 26700 
ggaaataata cttaagtact tagaagaata tccagatgca aaagtaaaaa ctgttactcc 26760 
atttaaagct agtggaatac atgaagacca tagagcatta ggagaagctg cattagagtt 26820 
atatagagaa ggcaaaataa aagatttaag attttatgta gaaccatacg actataagga 26880 
ttttaaaaaa gttaatccaa atgtagaagt atggaaggta ttacctagtc aagaggagaa 26940 
gttattaagt gcaatgaatg catataaaaa atggaatcct gaaagtggac attatgctat 27000 
aggatatcat tcagtaaaat ctcattttga tgagttagct acaaataaaa tacagtatgt 27060 
acatgctcca taaaatatta cgaaagaatt aggaggtaag aattaatgaa aatatcaaaa 27120 
aagatagtgt ctttgttaac tatgacattt ttaactgtta cattatatgg aaatacatct 27180
aatgcatcta caaaagatac attaacgggt tctggaagat gggaaacagc aataaaaata 27240 
agtcaagctg ggtggacaaa gtctgaaagt gctgtcttag taaatgacaa ttccatagca 27300 
gatgctttat cagctactcc atttgctaaa gcaaaagatg cgccaatatt attaactcaa 27360 
agtaataaat tggatagtag aacaaaagca gaattaaaaa gacttggtgt gaaaaatgta 27420 
tatttaatag gtggttcaat tgcattaagt tcagagattg aaaagcaatt aaatgcagaa 27480 
aatataaatt ttgaaagaat atccggaaat agtagatatg atacttcttt aaaattagct 27540 
gaaaagctag atagggaaaa gtctatatct aaaatagtag tagtaaatgg agaaaagggt 27600 
cttgctgatg cagtgagtgt tggagctata gctgctcaag aaaacatgcc aataatactt 27660 
tctgattcag agaatggaac tgaagtagct gataatttta tagatagtaa agatatagca 27720
aaatcgtatg taataggtgg tacatattct atttctaatt ctgtagaaag aagtttacca 27780 
aatgcaacaa gaatagcagg tagtagtaga agtgagacaa atgcaaaaat tatagaagaa 27840 
ttttataaag atactgatat aaaaaatatt tatgtaacaa aagatggtac aaagaataag 27900 
aatgatttaa tagattcttt agcagtgggt gtattagcag ctaaaaatag ttctccaata 27960 
gtattagcag gaaataagct tgatactact caaaaagatg ttttaaatac taagattata 28020
gataaagtta ctcaaattgg tggcttaggt aatgaaaatg ttgtagaaga tatactagat 28080
atacaagaag agactaagta tactgttgaa actattgatg aacttaatgc tgctataaaa 28140
agagcagatg caaatgatat tataaagttt aaaccagaga aagaaaaaac tataaataat 28200
tcttttagta ttgaaactaa aaaaactgtt actattgaat tagatggaag atatagacaa 28260
acaataactt tggatatacc taatggaaaa tttaataact atgcagaaat agaaggtgga 28320
gtaaagctaa aaaacataaa aaatgaatca ttagttaata aaggaagtat acaggattta 28380 
gatatatatg atgaaaatgg ttgtaaaata gagaatgaaa gttctggaga aatttggttt 2 8440 
gttactatag ttgaagaagc taatgatgta tatatagtta atagtggtga tataacaaaa 2 85 00 
atatcaaata attcttctag tactataata agaaattctg gaaatattga tacagttaca 2 8560 
ggaaaaaaag aacctgcaat aagtggaaat aaacctaaag tcaatgatac agaaaaagaa 2 8620 
actaaagctg ctagaggctt aaatcctagg gtagaagcat gttctgttcc taaaaaagat 28680 
tatgttatga taacaattcc taattcacct aaagattcaa gatataaaat ttactataga 2 8740 
gtagtttata ataaacctta tgcaatggat gttggtgata aaattaatat tggagaatgg 28800 
actgttgcac caacagatga agagccattt cttgaaaaag ctaaaaatgg ttgttatgta 28860 
gaagctgttg aggtaaatac ttcaacgaaa gaggtttcta gatggggaag aactaacgct 2 892 0 
acagatgatg gattttaaat attaaaatct tttaaaaaca aaaggtagac aattttttga 2 8980 
ttgtctactt tttattttta aaaattatca ttttactaaa caaataattt aaaactataa 29040 
caattatatt agtaaatact tttacaataa catcatttac aaaaagcaaa tcaatcatta 29100 
catacataat taacatttca acgacaccag aaagtagcct aaaactaaca aaagacaaaa 29160 
attccttaaa taagaacttt atttcaattc ttttactttc aaactcaaaa aacttattag 29220 
taacataagc aaataagaca gcaagtatcc atgccaaagc atttgcaacc ataaaattaa 29280 
ataaaataac tctagtaaaa tataaatacg aaactatatt tacaagagta gtaaatgctc 29340 
caaagaataa atataatatt gtctctttat gtttttttaa aatcaaatat aatcctccta 294 00 
agctaattta atataaatat acttaaagat actaacaaac attagtatag cacatctatt 294 60 
aataaatcaa tatatcttaa aatattataa aaataaaact ataatattct ataattaata 29520 
tgtactcata atccatattt tatcaaaata aaatatgacc ttaatttaga aatctgatat 29580 
aataaaaatt aaatacatat aagggggtaa acatgagaaa gtataaatca aaaaaattgt 29640
caaagctact agcattgtcg acagtttgct ttttaatagt gtcaacaata cctgtctcag 29700
cagaaaacca taaaactcta gatggagtag aaactgcaga gtattctgaa agctatcttc 29760 
aatacctaga agatgtcaaa aatggagaca cagcaaaata taatggagta ataccttttc 2 9820 
cacatgaaat ggaaggtaca acacttcgaa ataagggtag aagtagtctt ccatcagcat 2 98 80 
ataaatcaag tgtagcatac aacccaatgg atcttggtct tactacacca gcaaaaaatc 29940 
aaggaagtct taatacatgc tggtcttttt caggtatgtc aactttagaa gcatatctta 3 0000 
aattaaaagg atatggaacg tatgacttat cagaagaaca tttaagatgg tgggcaactg 3 00 60 
gtggaaaata tggatggaac ttagatgata tgtcagggag ttcaaatgta acagccatag 30120 
gatatctgac agcatgggca ggtcctaagt tagaaaagga cataccatat aatcttaaat 3 0180 
ctgaggcaca aggtgcaact aaaccttcaa atatggatac tgcacctact caatttaatg 3 0240 
taacagatgt tgttcgtctt aataaagata aggaaactgt aaaaaatgct ataatgcaat 303 00 
atggttcagt gacatcaggt tatgcacatt attcaacata ttttaataaa gatgaaacag 3 03 60 
catataactg tacaaataag agggctccat taaatcacgc tgtagcgata gtaggatggg 30420 
atgataatta ttcaaaagat aactttgcat ctgatgtaaa accagaatca aatggagcat 3 0480 
ggttagtaaa aagtagttgg ggagagttta attctatgaa aggattcttc tggatttctt 30540 
atgaagataa aactctacta acagatacag ataactatgc aatgaaatca gtatcaaaac 30600 
cagatagtga taaaaaaatg taccaacttg aatatgctgg tcttagcaag ataatgtcaa 3 0660 
ataaagtaac agcagcaaat gtatttgatt ttagcagaga ttctgaaaaa cttgactctg 3 0720 
ttatgtttga aacagattct gtaggagcaa aatatgaagt atattatgca ccagtagtaa 3 07 80 
atggagttcc tcaaaacaat tcaatgacaa aacttgcaag tggaacagta tcatattctg 3 0 840 
gatacataaa tgtacctact aattcttaca gcttaccaaa aggtaaagga gcaatagtag 3 0900 
tagttataga caacacagca aatcctaata gagaaaaatc aactttagca tatgaaacta 3 0960 
atatagatgc atattattta tatgaggcta aagcaaactt aggtgaaagc tacatacttc 31020 
aaaacaataa gtttgaagac ataaatacat atagtgaatt ttctccttgt aactttgtta 31080
taaaagctat aacaaaaaca tcttctggac aagctacttc aggagaatct ttaactggag 31140 
cagatagata tgaaacagca gttaaagtta gtcaaaaagg atggacttct tcacaaaatg 31200 
cagtattagt aaatggagac gcaatagtgg atgctttaac agctacacca tttacagcag 31260 
caatcgactc tccaattctt ttgacaggaa aagataactt agattcaaaa actaaggcag 31320 
agttacaaag attaggaact aaaaaagttt atctaatagg tggagaaaat tctttaagca 31380
agaatgtaca aactcaactt agtaatatgg gtatatcagt agaaagaatt tcaggtagtg 31440 
atagatacaa gaccagtata tctctagctc aaaagttaaa tagtataaaa tctgtttcac 31500 
aagttgcagt ggcaaatggt gtaaatggac ttgcagatgc aataagtgtt ggtgcagcag 31560 
ctgctgataa caatatgcca ataatactta ctaatgaaaa gagtgagttg caaggtgctg 31620 
atgagttttt aaattcatct aaaataacta agtcttatat aattggtggt acagctactt 31680
tatcatcaaa tttagaaagt aagctttcaa atccaacaag acttgcagga agtaatagaa 31740
atgaaactaa tgctaagata atagataaat tctaccctag ctcagatttg aaatatgctt 31800 
ttgttgttaa agatggttca aagagtcaag gagacttgat agatggatta gcagtaggag 31860 
cattaggtgc taaaactgat tcaccagtag ttctagttgg aaataagtta gatgaaagtc 31920 
aaaaaaatgt acttaagtct aagaaaatag aaactcctat tagagttggt ggaaatggaa 31980 
atgaaagtgc ttttaatgaa ctaaatactc ttttaggaaa atagtgaatg gaaagatagt 32040 
aaggatatac attttattat taaaaaatgt tcaatagaag tttataaata taaaataaca 32100 
aaaaggggtg tctcaaaatg aactttttag ttcatgaggc actctttttt gttgttttat 32160 
atttttatca tttaaataat gaaaaatagg cttgtctata tttttagata gttttcttat 32220 
tattgatgaa aaatgtaata attttaaact gtaaaatagt gtaaagttat tttataatat 322 80 
aaatatagta agattttaaa caaaataagg gggaataata atgaaagcac caaaaactat 32340 
tttaacaata ctgacaatag cgcttacttt aagtagcatt tctataatac catcatatgc 324 00 
acttacagag gaaaaattaa taggtaatgg tagatatgaa actgctgtaa agataagtca 32460 
aaaggcatac agttctagtc ctaatgtagt gctagttaat gacaattcat tagctgatgc 32520 
actttcagct acaccatttg caaaggctaa aggagcacca atacttttga cagaaagtga 32580 
taagctagat gataggactg aaaaagagat aaaacgtcta ggagctaaag atatttattt 32640 
gataggaggt actgctgtac ttaataaaga tatagagaac aagttaaaag gtaatggact 32700 
taatgtagaa agaattaatg gtaagaatag atatgaaact tcactgattt tggctaataa 32760 
gcttaaggat ataaaagata ttaaggaagt tgctgtagta aatggtgaaa aaggattaag 32820 
tgatgctgtt agtgttggag caccagcagc tcaaaataag atgcctataa ttttgtctaa 32880 
ccctaaggac ggagtagaag catttgataa atttataagg gatgagaaag ttataaaagc 32940 
atatgtaatt ggaggtacaa attctgtttc aagagctgta gaaaagagtc ttcctaatgc 33000 
agaaagagtg agtgggaaag atagaaatga gactaatgct aaagttatag aaaagtttta 33060 
cactgataca aatttaagta atttatatgt aactaaggat ggtagtaaaa atgagaatca 33120 
actaatagac tcgttagctg ttggagtatt ggctgcaaaa aatgagtctc ctattgtttt 33180 
ggttggcaat aaattaaaca caaaacaaag agatatactt agtactaaaa aattaaatac 3 3240 
tataacacag gtaggtggaa atggaaatga agaagctttt gatgaaatta aaagtctaca 33300
agaaaaaact gtgtttgaag cgaaaactgt tgaagaatta actgatatga taaatatagc 33360
aagtcctaat gatattataa actttaaacc aaaggagaat actgtaaatg aagcttttag 33420 
aatggtaaca aataaaccaa taactgtgaa tataaagggt gattgttcaa agactcttac 334 80 
agtagatatg cctaatggtg aagtgaataa ctatgctact ttggtaaatg ttatagttag 33540 
aaatattgga gaaggtggat ttaataatca tgatacaatc actatattgt cagttcgtga 33 600 
caaaaatggt agagttatag aaaatactag aaattcagat atagatacgt tgatgatatt 33 660 
agcttctgct aatgatacaa aattgattaa tgatggatat ataggaaaac ttatagataa 33720 
ttcatctaat agtgatataa ctaataatgg tactatagat aaaaaagtta atcaggttga 33 780 
agatttagag gctaaagtag attcaattga aaaagctata gattctataa gtcaaaaagt 33 840 
caataaaata caagatatac tagataagct tggatttcta aagaaatttc ttagctaaga 33900 
atatatatat ttaagtgaag tgtgtgttgt aattgttcta aagttatgta gttaaacagg 33 960 
tttatataca aatagattta tagataaata tgtagtttag tatatatttg aataagcttt 34 020 
ggggaggtaa catggggatt ggaattgaat gtaacaaaaa aataaataaa agtattgaca 34080 
aaactagaat aaagaaaaat aatctaaata gcaatagcaa tagcaatagc aatagcaata 34140 
gcaatagcaa tagcaatagc aatagcaata gcaatagcaa tagcaatagc aatagcaata 34200 
gcaataaaat ttcaaataat gctgtagatt ctaaacttat gaatagtagg attaaatcaa 34260 
tagatataat aagagggcta tcaatagccc tcatgatagt ctgcaataat cctgggactt 34320 
ggatgagaat gtatcctcaa ttaagacatg ctgtatggca tggtgtaaca cttgcagatt 34380 
ttgcatttcc tttttttgtt atatctttag gtgttacaat tccaatatca ataaatagta 34440 
aactaaaaaa taataaatcc acactgagta taatacttag tatattcaag agaagtatct 34500 
tattgatatt atttggattt ttcttaaatt acctaggaaa tcctgattta gatactgtaa 34560 
gaatacttgg agtacttcaa cgtatgggat tagtctattt tgtgactagt ttagtatatt 34620 
tactactaaa aaaattaaat gttggaagta ctgctacaat aattactttt ctatgtatat 34680 
ctacatttat aatagttggg tactacatat tagcaaaacc ttatggattt gagctagaag 34740 
gttctcttgc acagttagtg gatttacatt tcttcaaggg acatttgtac aagccagaat 34800
ttgaaccaga tggtttttta actagtatag tagcaatttc ttcaggtatg ttgggatgca 34860 
ctatgggatg tgtactttta aaagaagaca taggagaata taaaaaattc tttaaaatac 34920
tggttatgag cataatcctg ctaataggtg cttttatctt taatcagtat tttccattta 34980 
ataaaagatt atggtctagc tcatttgtac ttttaatggc tggcagttat ggtatattac 35040
tgtcaatatt ttattttata tgcgacatca aaaataaaag taaaatattt accccaataa 35100 
tagcacttgg aagtagccct atttttacat atatgtgcct tgaaatatta agtcatgttt 35160 
tttggaatgt tccaaaactt acgaacaaag tcgattatcc aactaccttg gtagagtgga 35220
ctacatatga acttataaca ccttgggcag gtacaacttg ggattctctt atattttctt 35280 
tattgtatgt cctattctgg attatagtaa tgtctataat gtacaagaaa aaaatattta 35340
ttaaaattta gtgcaatatt tttagatgat aaagtctttg atttgttcaa atatacggaa 35400 
actatcaaat ataatcaaaa tttctcctcg ataaaattta aaaaccatgc cataaaagat 35460
tacaaaaagt ttacaatttg taattttttg tgctataatg aaaaaagatt tacataatat 35520 
tgaatttatg taatataaaa caagcttaag atgtattaaa aaagatttaa agagttaaag 35580
aaccaaaaaa tgagggggga tatgatgaaa aaaacaacaa aattattggc aacaggaatg 35640
ttgtctgtag caatggtggc accaaacgta gctttagcgg ctgaaaatac tactgcaaat 35700 
acagaatcta attcagatat taatataaat ttacaaagaa aaagtgtagt gttaggaagt 35760 
aaatctaatg caagtgtaaa atttaaagaa aaattaaatg ctgattcaat aacattaaat 35820 
tttatgtgtt atgatatgcc acttgaagct actttaaatt acaatgaaaa gacagattca 35880
tatgaaggtg taataaacta taacaaagac ccagaatact taaatgtatg ggaacttcaa 35940 
agcataaaaa taaatggtaa agatgaacag aaagttttaa acaaagaaga cttagagtct 3 6000 
atgggattaa atcttaagga ttatgatgtt acacaagagt ttataataag tgatgcaaat 3 6060 
tcaacaaaag cagtaaatga atatatgaga aaaacttptg caccagtaaa aaaacttgct 3 6120 
ggagctacta gatttgaaac tgctgttgag ataagtaaac agggttggaa agatgggtct 3 6180 
agtaaggttg taatagttaa tggagaatta gcagcagatg gaattacagc tactccactt 3 6240 
gcatctacat atgacgcacc aatattactt gcaaataaag atgatatacc agaaagtaca 36300 
aaagcagagt taaaacgtct aaatccaagt gatgtaataa ttataggtga tgatggttca 3 6360 
gtttcacaaa aggcagtgtc tcaaattaaa tctgctgtaa atgtaaatgt gactcgtatt 36420 
ggtggagttg atagacatga aacatcttta cttatagcaa aagaaataga taagtatcat 3 6480 
gatgttaata aaatatacat agcaaatgga tatgctggtg aatatgatgc tcttaatatt 3 6540 
tcatcaaaag ctggtgaaga ccaacaacct ataatattag ctaataaaga ttctgtacca 36600 
caaggtacat ataattggtt atcttcacaa ggtttagaag aagcttatta cataggtggt 3 6660 
agtcaaagtt tatcatcaaa aattatagac caaatatcta aaatagcaaa aaatggtact 36720 
tcaaaaaata gagtttctgg agcagataga catgaaacaa atgcaaatgt aattaagact 3 6780 
ttctatccag ataaagagtt gagtgcaatg ttagttgcaa aatctgacat aatagttgat 36840 
tcaataacag caggaccttt agcagctaaa ttaaaagctc caatacttat aactccaaaa 3 6900 
acttatgttt ctgcatacca tagtacaaac ttatcagaaa aaacagcaga aacagtatat 36960
caaattggtg atggaatgaa agatagtgtt ataaattcta tagcatctag tttatcaaaa 37020
cacaatgcac caacagaacc agataattca ggttcagcag caggaaaaac tgtagtgata 37080 
gacccaggtc atgggggttc tgattcaggg gcaacaagtg gattaaatgg tggagcacaa 3714 0 
gaaaagaaat atactcttaa tacagcatta gcaactactg aatatttacg ttcaaaaggt 37200 
ataaatgttg ttatgactag agatacagat aaaacaatgg ctttaggaga aagaacagca 37260 
ttatcaaata ctataaagcc agatttattt acaagtatac attacaatgc gtctaatgga 37320 
tctggtaatg gtgtggaaat ttattacaaa gtaaaagata aaaatggtgg aacaactaag 373 80 
actgcagcat caaatatctt aaaaagaata ctagaaaaat ttaacatgaa aaatagagga 37440 
atcaagacaa gaacacttga taatggaaaa gattacctat atgtgttaag aaataacaat 37500 
tatccagcaa tacttgttga atgtgcattt attgacaata agagcgatat ggataagtta 37560 
aatactgcag aaaaagtaaa aacaatggga actcaaattg gaataggaat agaagataca 37620 
gtaaaataac atagataata ttaaagcaaa acatatcatc tcctgttata atgtaaaata 37680 
taacaagaga tgatatattt ttgtaaagaa tcaaattgtt tacaatgatg tttcttgtat 37740 
tgataatctg ctcattattg agtataatat agaaataagt caaaaataag atagaaataa 37800
gttgaaatta agaattaaaa aaatttttat tccaatatgg aggtaactat gtctggatat 37860 
actaatgatg agtgtgaaat acctaaaatt aaaagctatc caggggcaga taaagaaatt 3792 0 
gcaagtgaga ttgactatag catagtaaaa ggaacagttt tattcgactt gtaccaaaga 37 980 
attatggatt tagttttatc tataatagga cttgtgatag gtttaccttt aatagcaata 38040 
tttggaattc ttattaaaat agaggacaaa ggaccaataa cttataagca agaaagacta 3 8100 
ggaaaatgcg ggagaagatt ctatatatat aaattgagat ctatgagaac agacgcagaa 38160
aaatttggcg cacaatgggc tgaaaaagat gatcctagaa taactaaggt tggaaagttt 38220 
attaggaaaa ctagaattga tgagattcca cagctgttta atatattaaa gggggacatg 38280 
ggcttaatag gtcctagacc agagagaccg aactttactg ttcaatttaa tgaggaaata 38340 
cctggattta tcaacagact cgctataaaa cctgggctta caggttgggc acaagtaaat 38400
ggtggatacg aaatcactcc tgaagaaaaa cttaaagaag atatttacta tataaaaaat 3 8460 
aggagtatat tacttgactt taagatactc tttaaaactg ttaaagtagt attgacagga 38520 
gacggagcta gataatttta aaatattaga ttactaatca atgacaaagc tctctaaaaa 38580
gagctttttt attgtcaaaa aatacaaaaa agaaaaggaa tgagaactta tgattaaaaa 38640
aatctctact atattatcgt tagttttgtt aatctctatt tctagtacga taggagtatt 38700
tgcagatgca aatcccaaaa gagagttgat agaagggagt ataccagaga tttcgacaga 38760
attaaataaa agagcattta aggattctaa agaagtaatt cttgtaaatg aagagtcaat 38820 
tgtagatagt ataagtgcaa cacctttagc atattcaaaa aatgctccaa tagttgtaac 38880 
taagagtaaa aatcttggta gagtaacaag aaattatcta aaagaactag gtccagagaa 38940 
ggttacgata gtaggtggtt taaaggcagt atctaaagat gctgaacgta atatagaaaa 39000
aatgggaatg aaagtagaaa gaataagagg gaaagacaga tatgatacat ctttaaagat 39060
agctagagaa atgtatagaa cagtaggatt tgatgaagca tttttactta gctcaacaac 39120
aggattagag aacgccatat ctgtatattc ttatgccgct aaaagtggaa tgccaataat 39180
atgggctaaa gatgaagggt ttgaggaaca aattgatttc ttaaaaggca agaatcttaa 39240
aaaaatatat gcattaggtg attcaaaaga atttattgca gagattgatt ctaacttaaa 39300
aaatatagaa ggaataaagc aaataaataa gtcaagtaca aatgttgatt taataaagaa 39360
gttttatgat gaaaaagata taaagaaaat ctatactgct agattagact ttggtagtag 39420
atctgatgta aacgaatata tttctttggg agtagtatca gcaaaagaaa atatgccaat 39480
actgatttgt agtgataact taagtcgtgc acaagataaa ttcttgaaag atagcaatat 39540 
aaatgatgta gttgaagttg ggtatacagt aggtgattat agtctattta agagcatatt 3 9600 
taatctaaca ttcttatctt gtatagtatt aatactatta ttactattga ttacatttag 39660 
ggcacttaga tatgagtcga aatagttttt gatttataag gaagcttttc ataggttaaa 3 9720 
tttacttaat tagaataatg tttataagca gttttaaaag ttctactttt agagtttaaa 3 9780 
ttagataagg agaaagaaga tgtcaaaaac agctaaagca gcattgtgga ttatggcagc 3 9840 
cactatgttt tcaaaagtat taggtttcct tagagaactt gtacttgcaa atttctatgg 3 9900 
gacaggtatg tatgcagacg tatttgtatt aacattaaat atacctggtt taatcatagc 39960
agtcataggc tctgcagttg ctacaactta tatacctatg tactttgaga caaagaaaag 40020
actcggagat gaaggtgcct taaaatttac taataatgtt ttaaatatat gctatataat 40080
ggctatagta atagctatta ttggtctttt atttacagag caatttgtta cagtatttgc 40140 
agcaggattt agaaacgacc ctgctaagtt ccaagcagca atattattta ctaaaataat 402 00 
gatttcagga gttttattcc tgagtggaag taagatattt agttcgtttt tacaggtgaa 40260 
tgacagtttt gtaattcctg ggcttatagg aatcccatat aatattataa taatagcagc 40320 
aatagcacta agtgcaggta aaaatgtatg gatactgcca gcaggagcat tattggctat 40380
ggcgagtcag ctattgtttc aactgccatt tgcgtttaag aagtcctaca agtataagcc 40440
atatataaat ttaaaggatg agtctataaa agagttagtc aacctagtat tgcctatgtt 40500 
ggtaggagta gctgttggac agttaaatat atttgttgac cgattattgg caaccacttt 40560 
gggtgatgga aagttgtctg ctcttaacta tgcaaatagg ctaaatgagt ttgttatggc 40620
attatttgtt acatctataa taacagttat atatcctaaa ctagcaaaaa tgtctggtaa 40680
agataataaa gaaggattta taagcactat tgtaaaatct tcaaattgta ttatattggt 40740 
ggtgcttcct atatctatag gagctattat attagcagag cctcttgtta gaatattatt 40800 
ccaaaggggt aaatttgatg cgttatcaac agaccttaca tctatagcgc ttagattata 40860 
ttcactaggt ttattggctt gtggagtaag agatgtctta tatagggcat tttactctct 40920 
ttcagataca aaaacgccaa tgataaatgg aagtatagct ttaataataa atatagttct 40980 
taatttaata ttgataagac cattgggtca tgcaggtatt gctatatcaa ctagtacatc 41040 
aaacataatt acagttatat tattatttat atctcttaaa aagaaaaatg gatattttgg 41100 
aggggataaa attataaaga ctggtctaaa aagtttggta gcatctggtg tgatggcagt 41160 
agctacttta ttgatatata ataatctata tgcatttatg ggttcaggaa caattaaaga 41220 
aattatatca gttggagctg gagtattagg tggagcttct gtatatactg tattgatagt 412 80 
tttatttaag gtagaagaga tggatttagc ttttgaattt ttaagaaaag gaaaacaaaa 41340 
actacttaga agatagataa ttaccttgaa afctttaaatt agagtcagaa aggatgaaaa 41400 
tatggattat aagaataatt atgagatgtg gttaaattct ccttattttg atgaacaaac 41460 
gaaaaatgag ctattaagta taaaagatga tgaaaaagaa atacaagata gattttataa 41520 
aaatttagag tttggaacag gtggtttaag aggtataata ggagcgggaa ctaatagaat 41580 
caatatttat actgttagac gtgcgacact gggcgtttta aattatatca tgaaaactca 41640 
aggagaagaa ggtaagcaaa agggaatagt tatagcacat gatagtagat acatgtctag 41700 
ggagttttgt atagaagtag caaagacttt aagtgcttat ggtgtaaagg cttatatttt 41760 
tgaagaatta aagcctacac cagaactttc ttttgctgtt agatatttaa agtgtgctat 41820 
gggaatagta ataacagcaa gtcataatcc taaagaatac aatggatata aagtatatga 41880 
ctcagatggt ggtcaaatct gtatagatat ggcaaatgat ataatagcag aagttaataa 41940 
gattgatgat tacagtacta ttaaaagtat agattttgaa gaagctttat ctaaaaattt 42 000 
aataactata ttagataatg aagtagatga tgagtttata aaagctgtta agaaacaagt 42060 
tttaagacaa aatataatag atgaatacgg gaaaaaatta aaaataatat atacacctat 42120 
tcatggtaca ggaaataagc cagtaagaaa agtcttaaat gaatgtggtt ttgaaaatgt 42180 
aatggtagtt aaggagcaag aattgccaga ttctaatttt tcaacagtaa agtatcctaa 42240
tcctgaagaa aagtcagtgt ttaatatagc tattgaaatg gcaaagaata atggcactga 423 00 
tttgataata ggaacagacc ctgattgtga tagagtaggt atcgttgtaa aagattcaag 42360
tggtgaatat gtagtcttaa atggaaatca agtcggttca cttttagtta gatatatatt 42420 
agagagttta gttgaagaaa ataaacttcc taaaaacaat cctacaataa taaaaactat 42480 
agttacatca gaactaggtg caaaaatagc aaaggcttat aatgtagatt gtttaaatac 42540 
tttaacagga tttaaattta taggagaaaa aataaaagcg tttgaagaaa gcaatgatag 42600 
aagttttata atgggatatg aggaaagcta tggttactta attggaactc atgctagaga 42660 
taaggatggt gttgtatctt cacttatgat atgtgaaatg gcagcatact attcttcaaa 42720
aggtatgaac ttgtatgaag ctttgataga tacatataat aaatttggat actacaaaga 42780 
ggacttaaag tcagtaactt taaaaggcat agatggaata aaaaaaataa aagagatgat 42 840 
gctttacttt agaagtgtta aaatagataa tgtggctgat gtaaaagtag ataaaatttt 42900 
agactataaa gatggtgttg atgatttacc taagtcagat gtacttaaat ttttacttga 42960 
agatggttct tggatagcta taagaccttc aggtacagaa cctaaaatta aattttactt 43 02 0 
tggagcaaat tcagacaatc aagaagatgt agaatttaag ttgaacaatt taatatctta 43080 
tatattaaat gtagtggatt ctatataaaa agctaagggg agataaacaa tgaaagtata 43140 
caatgtcata atggctggag gaggaggaac tcgtttctgg ccacttagta gacaagaagt 43200 
acctaaacag cttataaatt taagtggaga agatgcttta ataaacgaaa ctataaacag 432 60 
aatagattct ctagcaaaaa aagatgattt atttatagta accaatgaga agcaattgga 43320 
agctttaaag gatatagtaa aagataaatg tttggatagt aatatattgc cagagccatg 433 80 
tgccagaaat acagcagcag ctataggttt tgcagcattt aacataatga aaaaatatgg 43440 
tgatggtgtt atgtgtgtat atccagctga ccattatata aaagatgaaa aagaatttaa 43500 
atctattttg gaaaaagcta tctatatagc agaaaataat gataaactag taactattgg 43560 
gataactcct acatttccat caacaggata tggatacata aattttaata gagaaaatac 43620 
tatagaagat gttgcatatg aggtagtgga gtttgtagaa aaaccaaact atgaaattgc 43680 
taaagaatat gtaaactcta aaaagtatgt atggaatagt ggaatgttcg tatggaaggt 43740 
atctaaaata ctagaagatt ttaagagata cttgccaaaa gtatatgaaa aattagaaga 43 800 
tataagtaaa tacttaggaa caaaagaaga aatggaaaaa ataaaagaaa tatacccaac 43860 
tatacaatca atctctatag attatggtat tatggaaaga agtaatgatg ttatagtagt 43920 
tcctggtgat tttggatgga atgatgttgg ttcttgggat tctctaggtg caatatatcc 43 980 
aactgatgat gaaggtaata tcaaaagagg tgaaaatatt actatagata ctaaaaattc 44 040 
aattatatat tcagatgaca aattaatatc gacaataggt attagtgatt tgatagtagt 44100 
atctacaaat gatgcagtta tggtttgtag aaaagataaa gcctcaagat gtcataaaaa 44160 
tagtagagca attaaaagaa gaagatagac aagaatatat gtaggaggtg ttcatttgat 44220 
tcctaaaaaa atacattatg tatggtttgg tggtccaaaa ggaaatatag aaaatatatg 44280
tataaattct tggaaagaaa agcttccaga atatgagata gtagagtgga atgaaaaaaa 44340
ctttgatata gagaaagaaa taaaaggaaa taagtttttg gaagaatgtt ataaaagaaa 44400 
actttgggca ttcatatctg attatacaag aataaaagta ctctatgaac aaggtggagt 44460 
ttatatggat acagatatgc aaatattaaa ggatataact cctcttttag aaaataatag 44520 
gctaatctgt gggtatgaag atgatagaga gtatataaat ggtgccataa taggtgtaga 44580 
aaaaggacat ccttttttaa aagacttatt ggagtattat gaaaaagaag tacttacttc 4464 0 
ttcattgttt acaataccaa agataatgac tcatcttatg gaaaagaatt ataagaagat 44700 
tgacccaaat aattatgaag aaggaatacg tgtttacgac aaagaatatt tttatccttt 44760 
tggatttaaa gaagacttta caccagaatg tataacagaa aacacttttg ggatacattg 4482 0 
gtggggaaaa agttgggcta aaaagagaaa ttatttctta gaaagcaaac acttaacagg 44 880 
tgtaaataag atatggaagt gttgtaaaat atttgcaagt aatactttac gaagctaaaa 44940 
gggagaaagt aaaatgaaga ctttaaatga tgatagaaat attatgatta gaaaatatct 45000 
ttatatattt gttttgttta tcatattact tagaaagtat atgatatcat ttattgaccc 45060 
aaatatggat atagggatga ttaaatcagc gttattttat tcttcaatag gaatactaat 45120 
gctacttttt ttatttgata agagaaaaag tataacagaa atggtactag ttggagtttg 45180 
tgtcttactt tatttattaa atagggaagg agctatatta ttaatcgttt tattagctgt 45240 
atcagctaag caaatagatg ataaatttat agttaaaaat tatttaataa tatcagcatg 45300 
ttttttaatg gtagctatat tattatttaa tctatttcct tcattgatat ttaatcagga 45360 
agtccactat agatatatag aaaaaataga tatgcttgtt acaagaatgg attttggact 4542 0 
tggtaatcca aatagtgtat attatcatat ggttactata tatgcagcat atatattctt 45480
aagatataaa gattataata aatgggatag aataatttta tttggttctg cattttttgt 45540 
ttatcaaacc acatatagta gaactggatt tttcactata ttagcaggat taatatttgt 45600 
agaaataata agatggatag atataaaaaa aataaaagga ctaccaatgc tcctaaagac 45660 
attaccgata atacttactt tatttagtgt aattattggt acagtatttg ataaaagcac 45720 
tttattaaat agattattag caagtagacc taagttctgg catgtatact tggctgaaaa 45780
ggggaacttt ctaaaacctt ttggaaattc atatagtcct gctataaaag caactaatcc 45 840 
attagatagt tcttatgtat atataattag tgtactagga gtagttgctt gtatattatt 45900
tatgtatttg atgtataaag gaatagaaag ctttatagaa aaagataaaa aagcatattt 45 960 
agtagcttta tttatatttt tactatactc ttttgcagaa aatttattat tagaagcttc 46020
atttagtttt gcagtagtat tactaataaa agaagtgata ttaaatgata aaagggaaat 46080 
agatttgtgg aagatgaaaa gtaggaggta gtaatgttaa tatcattaat aatgccaaca 46140
ttaaatagat atgatgatat atacttactt atggacagct tagaaaatca aacttacaaa 46200
aactttgaac ttatagtagt agaccaaaat gataatagca aagttaaaga aatagttgat 46260 
aaatacattg ataagttaga tataaagtat ataaaaagct ctaaaaaagg attgagttat 46320 
aacagaaatg taggcataga caatgctgta gggcaaatta taggtttccc tgatgatgat 46380 
tgtgtctacg aaaatgatac attagaaaaa gttataaatt tctttaataa aaataaagat 46440
tataaaatat acagttgcaa aactatggat tcaaataaag ttgatgcatt taagaaaatg 46500
tatgatggaa cttgtgatat cacaagttcc aacgttttgg atactataac atccataact 46560 
ttttttattg attttgaagg taaagactat acaagatttg atgagaaatt gggtgttggt 46620 
ggagagtttg gtgctggtga agagatagat tatgttttaa acttattgag tttaggtttt 46680 
aaaggaaaat actttggaaa tgatataatt taccatcctg caaaaaaaca ttctaaatct 46740 
aaagaaaagt atcaaaaaga ttataactat ggaagaggct ttggagcgct ttgcaaaaag 46800
gaaatagtct atagaaaaaa ttataagttt gcaaaagtta tggtatcaaa acttgtaaga 46860 
aatataggag gtcttatatt gagttccaat agagattatc atagtgccac tataaaagga 46920 
agaattaatg gatttagaca gtacaagtta taatgaggat ggtgtttatg gacaaggtta 46980 
aatttttaat aaatatgctt ttggcttttt ttatatatcc atttaataag cataagttta 47 040 
atggtagaga tatctggctt ataggtggac attcagggga tatatacaat gataattcaa 47100 
agttttttta tgagtatatg ctaaaagaac ataatgatgt agaaacttat tgggtagtaa 47160 
ataaagacag taaagtattt gataaaattc caggcaaaaa attaataaga ggaagtgtag 47220 
aaaactatct gtattattac aattcaaaag ctatagtatt ttcacatgca ccatcagcag 472 80 
atatagcacc atacaatttt gctgtacctg ttcttaatta ttttcataaa aaaacgataa 47340 
aagtgttttt aaatcatggg actataagtt ttaaaaaaag aaaacctatg aacaaaaagt 474 00 
tcaaaaatat tatagataac ctatataaaa gctataatat agttacagca agttctgaat 47460 
ttgaaagaaa tgttatggta aatgattggg gaatgttaga tgactcagtt tatataattg 47520 
gaaatgcaag atatgacaat ttaccaacaa atgaagttgc acaaactcga gatatattat 47580 
atactcctac atggagagat tggataaaat ttagttcagg aaaatttacg gatacagatt 47640 
attttaaaaa tataatgaat tttttaaatg atgacaaatt aaataaaata ttagatgaga 47700 
aagatataaa tgtaaaaatt tatatgcatc atcttatgca tgagtttata gatgatataa 47760 
aagaaaatat aacaggaaaa cgaattgtgt ttttagataa aggggtaact ctagcaaatg 47820 
aaatcagaaa atcagctgca aatataactg attattcaag tgtagccata gactttttgt 47880 
atatgaatag acctattttg ttttaccaat tcgatttgga tgaatatatg gaaaaagttg 47940
attcatatat agatttaaaa agtgaaatgt ttgggtcttt ggcttataat aatgatgaag 48000

ccgtgaataa acttattgat attattgaaa ataattttga ggttatggat aaccaaaaaa 48060 

atgaaagaaa taagtttttc agatataatg ataataagaa ctgcaagagg atatatgatt 48120 

gtgtattaag taaaattaaa taaatatttt ataaatgatg aggaaggtaa tatgaagaag 48180

aatttagtat ctataattac tcccatgtat aattctgaaa aatttattga agcaaccata 48240 

aaatcagtat taaaccaaac ttatcaagaa tgggaaatgt taattattga tgattgctca 48300 

acagataata gtcctaatat agtcaaatct tatatgcaac aggatagtag aataaaatgt 48360 

ataaagactg agactaataa gggtgtctct aatgctagaa atttagcact aagtaaggca 48420

acaggacaat ttatagcttt tttangtagt gatgaccaat ggaatagtag taagttagaa 48480 

aaacaagtaa attttatgtt agaaaatgac tatgtaattt catttacttc atatgaactg 48540 

atggatgaaa a 48551 





INTERNATIONAL SEARCH REPORT 

International application No. PCT/SE 01/01280 



A. CLASSIFICATION OF SUBJECT MATTER 

IPC7: C12N 15/63, A61K 39/08 

According to International Patent Classification (IPC) or to both national classification and IPC 



B. FIELDS SEARCHED 

Minimum documentation searched (classification system followed by classification symbols) 

IPC7: A61K, C12N 

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched 

SE, DK, FI, NO classes as above 

Electronic data base consulted during the international search (name of data base and, where practicable, search terms used) 



BIOSIS, WPI DATA, PAJ, EPO-INTERNAL 



C. DOCUMENTS CONSIDERED TO BE RELEVANT 



Category* 


Citation of document, with indication, where appropriate, of the relevant passages 


Relevant to claim No. 


Y 


WO 9519371 A2 (SOLVAY), 20 July 1995 (20.07.95), 
page 12, line 1 - line 9, claim 12, 
abstract 


1-33 


Y 


Microbial Pathogenesis, Volume 28, 2000, 

Marina Cerquetti et al, "Characterization of 
surface layer proteins from different Clostridium 
difficile clinical isolates" page 363 - page 372 


1-33 


P,Y 


Molecular Microbiology, Volume 40, No 5, 2001, 

Emanuela Calabi et al, "Molecular characterization 
of the surface layer proteins from Clostridium 
difficile" page 1187 - page 1199 


1-33 


[X] Further documents are listed in the continuation of Box C.    [X] See patent family annex. 


* Special categories of cited documents: 

"A" document defining the general state of the art which is not considered 
to be of particular relevance 
"E" earlier application or patent but published on or after the international 
filing date 
"L" document which may throw doubts on priority claim(s) or which is 
cited to establish the publication date of another citation or other 
special reason (as specified) 
"O" document referring to an oral disclosure, use, exhibition or other 
means 
"P" document published prior to the international filing date but later than 
the priority date claimed 

"T" later document published after the international filing date or priority 
date and not in conflict with the application but cited to understand 
the principle or theory underlying the invention 
"X" document of particular relevance: the claimed invention cannot be 
considered novel or cannot be considered to involve an inventive 
step when the document is taken alone 
"Y" document of particular relevance: the claimed invention cannot be 
considered to involve an inventive step when the document is 
combined with one or more other such documents, such combination 
being obvious to a person skilled in the art 
"&" document member of the same patent family 


Date of the actual completion of the international search 
28 Sept 2001 


Date of mailing of the international search report 

02-10-2001 


Name and mailing address of the ISA/ 
Swedish Patent Office 
Box 5055, S-102 42 STOCKHOLM 
Facsimile No. + 46 8 666 02 86 


Authorized officer 

Carolina Gomez Lagerlof/BS 

Telephone No. + 46 8 782 25 00 



Form PCT/ISA/210 (second sheet) (July 1998) 



C (Continuation). DOCUMENTS CONSIDERED TO BE RELEVANT 

Category* 

Citation of document, with indication, where appropriate, of the relevant passages 

Relevant to claim No. 

P,Y 

Infection and Immunity, Volume 69, No 5, May 2001, 
Tuomo Karjalainen et al, "Molecular and Genomic 
Analysis of Genes Encoding Surface Anchored 
Proteins from Clostridium difficile" 
page 3442 - page 3446 

1-33 



Form PCT/ISA/210 (continuation of second sheet) (July 1998) 



Box I Observations where certain claims were found unsearchable (Continuation of item 1 of first sheet) 

This international search report has not been established in respect of certain claims under Article 17(2)(a) for the following reasons: 

1. [X] Claims Nos.: 30-32 

because they relate to subject matter not required to be searched by this Authority, namely: 

see next sheet 



2. [ ] Claims Nos.: 

because they relate to parts of the international application that do not comply with the prescribed requirements to such 
an extent that no meaningful international search can be carried out, specifically: 



3. [ ] Claims Nos.: 

because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a). 



Box II Observations where unity of invention is lacking (Continuation of item 2 of first sheet) 



This International Searching Authority found multiple inventions in this international application, as follows: 



1. [ ] As all required additional search fees were timely paid by the applicant, this international search report covers all 
searchable claims. 

2. [ ] As all searchable claims could be searched without effort justifying an additional fee, this Authority did not invite payment 
of any additional fee. 

3. [ ] As only some of the required additional search fees were timely paid by the applicant, this international search report 
covers only those claims for which fees were paid, specifically claims Nos.: 

4. [ ] No required additional search fees were timely paid by the applicant. Consequently, this international search report is 
restricted to the invention first mentioned in the claims; it is covered by claims Nos.: 

Remark on Protest 

[ ] The additional search fees were accompanied by the applicant's protest. 
[ ] No protest accompanied the payment of additional search fees. 



Form PCT/ISA/210 (continuation of first sheet (1)) (July 1998) 



Claims 30-32 relate to methods of treatment of the human or 
animal body by surgery or by therapy (PCT Rule 39.1(iv)). 
Nevertheless, a search has been executed for these claims. The 
search has been based on the gene expression cassette 
according to claims 1-10. 



Form PCT/ISA/210 (extra sheet) (July 1998) 



Information on patent family members 


Patent document             Publication    Patent family        Publication 
cited in search report      date           member(s)            date 

WO 9519371 A2               20/07/95       AU 6380194 A         24/10/94 
                                           EP 0691845 A         17/01/96 
                                           EP 0738278 A         23/10/96 
                                           GB 2291594 A,B       31/01/96 
                                           GB 9400650 D         00/00/00 
                                           GB 9519866 D         00/00/00 
                                           JP 8508474 T         10/09/96 
                                           JP 9508012 T         19/08/97 
                                           US 5874267 A         23/02/99 
                                           US 6028098 A         22/02/00 



Form PCT/ISA/210 (patent family annex) (July 1998) 


